Required. The resource name of the data store, such as projects/*/locations/global/collections/default_collection/dataStores/default_data_store. This field identifies the data store in which to train the models.
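As a minimal sketch of how the resource name fits into the request URL, the helper below assembles a data store name in the pattern shown above and appends it to the endpoint template from this page's HTTP request. The function names and the project ID "my-project" are hypothetical and chosen only for illustration.

```python
# Hypothetical helpers; the function names and "my-project" are illustrative only.
def data_store_name(project: str,
                    location: str = "global",
                    collection: str = "default_collection",
                    data_store: str = "default_data_store") -> str:
    """Build a data store resource name in the documented pattern."""
    return (f"projects/{project}/locations/{location}"
            f"/collections/{collection}/dataStores/{data_store}")


def train_custom_model_url(name: str) -> str:
    """Substitute the resource name into the v1beta trainCustomModel template."""
    return f"https://discoveryengine.googleapis.com/v1beta/{name}:trainCustomModel"


name = data_store_name("my-project")
url = train_custom_model_url(name)
print(url)
```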
Request body
The request body contains data with the following structure:
JSON representation
{
  "modelType": string,
  "errorConfig": {
    object (ImportErrorConfig)
  },
  "modelId": string,

  // Union field training_input can be only one of the following:
  "gcsTrainingInput": {
    object (GcsTrainingInput)
  }
  // End of list of possible types for union field training_input.
}
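The structure above could be assembled in Python roughly as follows. This is a sketch under stated assumptions: the bucket and file names are hypothetical, the corpusDataPath key is assumed by analogy with the other path fields of GcsTrainingInput, and errorConfig is left out because its contents are not described on this page.

```python
import json
from typing import Optional


def build_train_request(corpus_path: str, query_path: str, train_path: str,
                        test_path: Optional[str] = None,
                        model_id: Optional[str] = None) -> dict:
    """Assemble a request body for the search-tuning model type."""
    gcs_input = {
        "corpusDataPath": corpus_path,  # assumed field name, see the note above
        "queryDataPath": query_path,
        "trainDataPath": train_path,
    }
    if test_path is not None:
        gcs_input["testDataPath"] = test_path  # optional field
    body = {"modelType": "search-tuning", "gcsTrainingInput": gcs_input}
    if model_id is not None:
        body["modelId"] = model_id
    return body


# Hypothetical bucket and file names.
body = build_train_request(
    "gs://my-bucket/corpus.jsonl",
    "gs://my-bucket/queries.jsonl",
    "gs://my-bucket/train.tsv",
)
print(json.dumps(body, indent=2))
```

Only one member of the training_input union is set here, which matches the "only one of the following" constraint in the representation.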
Fields
modelType
string
Model to be trained. Supported values are:
search-tuning: Fine-tunes the search system based on the data provided.
corpusDataPath
string
The Cloud Storage corpus data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).
For the search-tuning model, each line should have the Id, title and text. Example: {"Id": "doc1", "title": "relevant doc", "text": "relevant text"}
queryDataPath
string
The Cloud Storage query data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON).
For the search-tuning model, each line should have the Id and text. Example: {"Id": "query1", "text": "example query"}
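Before uploading the corpus and query files, it can help to check that every line parses as JSON and carries the keys listed above. The checker below is a local sketch, not part of the API: the function name, the REQUIRED_KEYS table and the sample lines are hypothetical, and the key names are taken from the examples on this page.

```python
import json

# Keys each line is expected to carry, per the examples on this page.
REQUIRED_KEYS = {
    "corpus": {"Id", "title", "text"},
    "query": {"Id", "text"},
}


def check_jsonl(lines, kind):
    """Return (line_number, problem) pairs for lines that fail the check."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = REQUIRED_KEYS[kind] - set(record)
        if missing:
            problems.append((number, "missing keys: " + ", ".join(sorted(missing))))
    return problems


corpus_lines = [
    '{"Id": "doc1", "title": "relevant doc", "text": "relevant text"}',
    '{"Id": "doc2", title: "unquoted key", "text": "not valid JSON"}',
]
query_lines = ['{"Id": "query1", "text": "example query"}', '{"Id": "query2"}']

print(check_jsonl(corpus_lines, "corpus"))
print(check_jsonl(query_lines, "query"))
```

The second corpus line is rejected because JSON requires quoted keys, and the second query line is reported as missing its text.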
trainDataPath
string
Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file must be in TSV format. Each line should contain a query ID, a document (corpus) ID, and a numeric score.
For the search-tuning model, the TSV file must start with the header query-id, corpus-id, score. The score should be a number in [0, +inf). The larger the score, the more relevant the pair. Example:
query-id\tcorpus-id\tscore
query1\tdoc1\t1
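A file in this layout can be produced with Python's standard csv module. The sketch below is an illustration only: the function name is hypothetical, and the score check simply mirrors the stated range by rejecting negative, infinite and NaN values.

```python
import csv
import io
import math

HEADER = ["query-id", "corpus-id", "score"]


def write_train_tsv(rows, stream):
    """Write (query_id, corpus_id, score) rows as TSV with the expected header."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    for query_id, corpus_id, score in rows:
        value = float(score)
        # The score must be a finite number greater than or equal to zero.
        if math.isnan(value) or math.isinf(value) or value < 0:
            raise ValueError(f"score out of range for {query_id}/{corpus_id}: {score!r}")
        writer.writerow([query_id, corpus_id, score])


buffer = io.StringIO()
write_train_tsv([("query1", "doc1", 1)], buffer)
print(buffer.getvalue(), end="")
```

Writing to an in-memory buffer keeps the example self-contained. To produce a real file, one option is to pass a handle opened with open(path, "w", newline="").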
testDataPath
string
Cloud Storage test data path, in the same format as trainDataPath. If not provided, a random 80/20 train/test split is performed on trainDataPath.
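To make the 80/20 idea concrete, or to prepare a separate test file yourself, a random split can be sketched as below. This is not the service's implementation, which this page does not describe. The function name, the fixed seed and the sample rows are assumptions made for illustration.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out roughly test_fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    test_size = round(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical (query-id, corpus-id, score) rows.
rows = [(f"query{i}", f"doc{i}", 1) for i in range(10)]
train_rows, test_rows = split_train_test(rows)
print(len(train_rows), len(test_rows))
```

Each returned list can then be written out with the query-id, corpus-id, score header and referenced through trainDataPath and testDataPath respectively.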
HTTP request
POST https://discoveryengine.googleapis.com/v1beta/{dataStore=projects/*/locations/*/collections/*/dataStores/*}:trainCustomModel
The URL uses gRPC Transcoding (https://google.aip.dev/127) syntax.
Response body
If successful, the response body contains an instance of Operation.
Authorization scopes
Requires the following OAuth scope:
https://www.googleapis.com/auth/cloud-platform
For more information, see the Authentication Overview.
IAM Permissions
Requires the following IAM (https://cloud.google.com/iam/docs) permission on the dataStore resource:
discoveryengine.dataStores.trainCustomModel
For more information, see the IAM documentation (https://cloud.google.com/iam/docs).
Last updated 2025-06-27 UTC.