Required. The resource name of the Data Store, such as projects/*/locations/global/collections/default_collection/dataStores/default_data_store. This field identifies the data store in which to train the models.
Request body
The request body contains data with the following structure:
JSON representation
{
  "modelType": string,
  "errorConfig": {
    object (ImportErrorConfig)
  },
  "modelId": string,

  // Union field training_input can be only one of the following:
  "gcsTrainingInput": {
    object (GcsTrainingInput)
  }
  // End of list of possible types for union field training_input.
}
Fields
modelType
string
Model to be trained. Supported values are:
search-tuning: Fine-tunes the search system based on the data provided.
corpusDataPath
string
The Cloud Storage corpus data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).
For the search-tuning model, each line should have the Id, title and text. Example: {"Id": "doc1", "title": "relevant doc", "text": "relevant text"}
queryDataPath
string
The Cloud Storage query data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).
For the search-tuning model, each line should have the Id and text. Example: {"Id": "query1", "text": "example query"}
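As an illustration of the two JSONL formats described above, the following Python sketch serializes corpus and query records as one JSON object per line and checks that each record carries the fields named in the text. The helper name to_jsonl and the sample records are hypothetical, the field names are taken from the examples above, and uploading the result to Cloud Storage is left out.

```python
import json

# Field names as they appear in the examples in this section.
CORPUS_FIELDS = ("Id", "title", "text")
QUERY_FIELDS = ("Id", "text")


def to_jsonl(records, required_fields):
    """Serialize dicts as newline-delimited JSON (hypothetical helper).

    Raises ValueError if a record lacks one of the required fields.
    """
    lines = []
    for index, record in enumerate(records):
        missing = [name for name in required_fields if name not in record]
        if missing:
            raise ValueError(f"record {index} is missing fields: {missing}")
        # One JSON object per line, no embedded line breaks.
        lines.append(json.dumps(record, ensure_ascii=False) + "\n")
    return "".join(lines)


# Sample data mirroring the examples in the field descriptions.
corpus = [{"Id": "doc1", "title": "relevant doc", "text": "relevant text"}]
queries = [{"Id": "query1", "text": "example query"}]

corpus_jsonl = to_jsonl(corpus, CORPUS_FIELDS)
queries_jsonl = to_jsonl(queries, QUERY_FIELDS)
```

Each returned string can be written to a local file and then copied to the bucket referenced by the corresponding data path.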
trainDataPath
string
Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should contain the query ID, the document (corpus) ID, and a numeric score.
For the search-tuning model, the file should start with the tab-separated header query-id corpus-id score. The score should be a number in [0, +inf). The larger the score, the more relevant the pair. Example:
query-id\tcorpus-id\tscore
query1\tdoc1\t1
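A file in this shape could be produced along the following lines. This is a minimal Python sketch, assuming the column order given by the header above. The helper name to_training_tsv is hypothetical, and the range check simply mirrors the stated score interval (non-negative and finite).

```python
import csv
import io
import math

# Header columns for the search-tuning training file, as stated above.
HEADER = ("query-id", "corpus-id", "score")


def to_training_tsv(rows):
    """Render (query_id, corpus_id, score) tuples as TSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    for query_id, corpus_id, score in rows:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(score, bool) or not isinstance(score, (int, float)):
            raise TypeError(f"score must be a number, got {score!r}")
        # Scores must be finite and non-negative.
        if math.isnan(score) or math.isinf(score) or score < 0:
            raise ValueError(f"score out of range: {score!r}")
        writer.writerow((query_id, corpus_id, score))
    return buffer.getvalue()


tsv_text = to_training_tsv([("query1", "doc1", 1)])
```

Here tsv_text holds the header line followed by one data line, with real tab characters between the columns.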
testDataPath
string
Cloud Storage test data. Same format as trainDataPath. If not provided, a random 80/20 train/test split will be performed on trainDataPath.
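To show how these fields fit together, here is a hedged Python sketch that assembles a request body and the method URL. The function names, bucket, and project are hypothetical. The key corpusDataPath is assumed to be the name of the corpus path field, by analogy with the other path fields. Sending the request (authentication, HTTP client) is deliberately left out.

```python
import json

# Endpoint host and version as given in the method's HTTP request template.
BASE_URL = "https://discoveryengine.googleapis.com/v1"


def build_train_request(corpus_path, query_path, train_path, test_path=None):
    """Build a trainCustomModel request body as a dict (hypothetical helper)."""
    paths = {
        "corpusDataPath": corpus_path,  # assumed field name, see note above
        "queryDataPath": query_path,
        "trainDataPath": train_path,
    }
    # testDataPath is optional; when omitted the service splits trainDataPath.
    if test_path is not None:
        paths["testDataPath"] = test_path
    for name, value in paths.items():
        if not value.startswith("gs://"):
            raise ValueError(f"{name} must be a gs:// path, got {value!r}")
    return {"modelType": "search-tuning", "gcsTrainingInput": paths}


def train_url(data_store):
    """Return the POST URL for a data store resource name."""
    return f"{BASE_URL}/{data_store}:trainCustomModel"


# Hypothetical bucket and project names, for illustration only.
body = build_train_request(
    "gs://my-bucket/corpus.jsonl",
    "gs://my-bucket/queries.jsonl",
    "gs://my-bucket/train.tsv",
)
url = train_url(
    "projects/my-project/locations/global/collections/"
    "default_collection/dataStores/default_data_store"
)
payload = json.dumps(body, indent=2)
```

The optional modelId and errorConfig fields are omitted here because this section does not describe their contents.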
Method: projects.locations.collections.dataStores.trainCustomModel
Trains a custom model.
HTTP request
POST https://discoveryengine.googleapis.com/v1/{dataStore=projects/*/locations/*/collections/*/dataStores/*}:trainCustomModel
The URL uses gRPC Transcoding syntax (https://google.aip.dev/127).
Response body
If successful, the response body contains an instance of Operation.
Authorization scopes
Requires the following OAuth scope:
https://www.googleapis.com/auth/cloud-platform
For more information, see the Authentication Overview.
IAM Permissions
Requires the following IAM permission on the dataStore resource:
discoveryengine.dataStores.trainCustomModel
For more information, see the IAM documentation (https://cloud.google.com/iam/docs).
GcsTrainingInput
Cloud Storage training data input.
Last updated 2025-06-27 UTC.