Reference documentation and code samples for the Google Cloud Discovery Engine V1 Client class GcsTrainingInput.
Cloud Storage training data input.
Generated from protobuf message google.cloud.discoveryengine.v1.TrainCustomModelRequest.GcsTrainingInput
Namespace
Google \ Cloud \ DiscoveryEngine \ V1 \ TrainCustomModelRequest
Methods
__construct
Constructor.
Parameters
Name
Description
data
array
Optional. Data for populating the Message object.
↳ corpus_data_path
string
The Cloud Storage corpus data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON). For a search-tuning model, each line should have the _id, title, and text fields. Example: {"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
↳ query_data_path
string
The Cloud Storage query data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file must be newline-delimited JSON (JSONL/NDJSON). For a search-tuning model, each line should have the _id and text fields. Example: {"_id": "query1", "text": "example query"}
↳ train_data_path
string
Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should have the query_id, the doc_id, and a score (number). For a search-tuning model, the file should have query-id, corpus-id, and score as the TSV header. The score should be a number in [0, +inf). The larger the number, the more relevant the pair. Example header line: query-id\tcorpus-id\tscore. Example data line: query1\tdoc1\t1
↳ test_data_path
string
Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path.
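The constructor keys listed above can be assembled as in the following minimal sketch. The helper name buildGcsTrainingInputData, the bucket name my-bucket, and the file names are hypothetical placeholders, and the sketch assumes the google/cloud-discoveryengine package and a Composer autoloader are already loaded; when they are not, it only prints the assembled array.

```php
<?php
use Google\Cloud\DiscoveryEngine\V1\TrainCustomModelRequest\GcsTrainingInput;

// Hypothetical helper: builds the "data" array accepted by the constructor.
// Bucket and file names are illustrative assumptions, not required values.
function buildGcsTrainingInputData(string $bucket, ?string $testFile = null): array
{
    $data = [
        'corpus_data_path' => "gs://{$bucket}/corpus.jsonl",
        'query_data_path'  => "gs://{$bucket}/queries.jsonl",
        'train_data_path'  => "gs://{$bucket}/train.tsv",
    ];
    // test_data_path is optional; when it is omitted, the service performs
    // a random 80/20 train/test split on train_data_path.
    if ($testFile !== null) {
        $data['test_data_path'] = "gs://{$bucket}/{$testFile}";
    }
    return $data;
}

$data = buildGcsTrainingInputData('my-bucket');

// Only instantiate the message when the client library is actually available.
if (class_exists(GcsTrainingInput::class)) {
    $input = new GcsTrainingInput($data);
    echo $input->getTrainDataPath(), PHP_EOL;
} else {
    echo json_encode($data, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
}
```

Passing a file name as the second argument adds test_data_path, which supplies a dedicated test set instead of relying on the automatic split.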
getCorpusDataPath
The Cloud Storage corpus data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file must be newline-delimited JSON (JSONL/NDJSON).
For a search-tuning model, each line should have the _id, title,
and text fields. Example:
{"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
Returns
Type
Description
string
setCorpusDataPath
The Cloud Storage corpus data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file must be newline-delimited JSON (JSONL/NDJSON).
For a search-tuning model, each line should have the _id, title,
and text fields. Example:
{"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
Parameter
Name
Description
var
string
Returns
Type
Description
$this
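To illustrate the corpus file format described above, here is a minimal sketch that serializes documents into JSONL lines with json_encode. The function names corpusLine and toCorpusJsonl are hypothetical, and uploading the resulting file to Cloud Storage is out of scope here.

```php
<?php
// Hypothetical helper: encodes one corpus document as a single JSONL line
// containing the _id, title, and text fields.
function corpusLine(string $id, string $title, string $text): string
{
    return json_encode(
        ['_id' => $id, 'title' => $title, 'text' => $text],
        JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR
    );
}

// Hypothetical helper: joins documents into newline-delimited JSON.
// Each element of $docs is an array with keys "_id", "title", and "text".
function toCorpusJsonl(array $docs): string
{
    $lines = array_map(
        fn (array $doc): string => corpusLine($doc['_id'], $doc['title'], $doc['text']),
        $docs
    );
    return implode("\n", $lines) . "\n";
}

echo toCorpusJsonl([
    ['_id' => 'doc1', 'title' => 'relevant doc', 'text' => 'relevant text'],
]);
```

Using json_encode rather than string concatenation keeps every key quoted and every value correctly escaped, so each line stays valid JSON.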
getQueryDataPath
The Cloud Storage query data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file must be newline-delimited JSON (JSONL/NDJSON).
For a search-tuning model, each line should have the _id
and text fields. Example: {"_id": "query1", "text": "example query"}
Returns
Type
Description
string
setQueryDataPath
The Cloud Storage query data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file must be newline-delimited JSON (JSONL/NDJSON).
For a search-tuning model, each line should have the _id
and text fields. Example: {"_id": "query1", "text": "example query"}
Parameter
Name
Description
var
string
Returns
Type
Description
$this
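Before uploading a query file, it can help to check locally that every line matches the format described above. The following is a minimal sketch of such a check; the function name invalidQueryLines is hypothetical, and the check only verifies that each line is a JSON object with non-null _id and text fields, which is an assumption about what the service needs rather than its actual validation logic.

```php
<?php
// Hypothetical helper: returns the 1-based numbers of lines that are not
// JSON objects containing both "_id" and "text". Blank lines are skipped.
function invalidQueryLines(string $jsonl): array
{
    $bad = [];
    foreach (explode("\n", $jsonl) as $index => $line) {
        if (trim($line) === '') {
            continue;
        }
        $row = json_decode($line, true);
        // json_decode returns null for malformed JSON, which fails is_array().
        if (!is_array($row) || !isset($row['_id'], $row['text'])) {
            $bad[] = $index + 1;
        }
    }
    return $bad;
}

$sample = '{"_id": "query1", "text": "example query"}' . "\n"
        . '{"text": "missing id"}' . "\n";

print_r(invalidQueryLines($sample));
```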
getTrainDataPath
Cloud Storage training data path, in the format
gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV
format. Each line should have the query_id, the doc_id, and a score (number).
For a search-tuning model, the file should have query-id, corpus-id, and
score as the TSV header. The score should be a number in [0, +inf).
The larger the number, the more relevant the pair. Example:
query-id\tcorpus-id\tscore
query1\tdoc1\t1
Returns
Type
Description
string
setTrainDataPath
Cloud Storage training data path, in the format
gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV
format. Each line should have the query_id, the doc_id, and a score (number).
For a search-tuning model, the file should have query-id, corpus-id, and
score as the TSV header. The score should be a number in [0, +inf).
The larger the number, the more relevant the pair. Example:
query-id\tcorpus-id\tscore
query1\tdoc1\t1
Parameter
Name
Description
var
string
Returns
Type
Description
$this
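The TSV layout described above can be produced as in this minimal sketch, which writes the query-id, corpus-id, score header and rejects scores that are negative or not finite. The function name trainTsv is hypothetical, and the sketch assumes IDs contain no tab or newline characters.

```php
<?php
// Hypothetical helper: builds the training TSV content from rows of
// [queryId, docId, score]. Scores must be finite numbers >= 0.
function trainTsv(array $rows): string
{
    $lines = ["query-id\tcorpus-id\tscore"];
    foreach ($rows as [$queryId, $docId, $score]) {
        if (!is_numeric($score) || !is_finite((float) $score) || $score < 0) {
            throw new InvalidArgumentException(
                'Score must be a finite number >= 0, got: ' . var_export($score, true)
            );
        }
        $lines[] = "{$queryId}\t{$docId}\t{$score}";
    }
    return implode("\n", $lines) . "\n";
}

echo trainTsv([
    ['query1', 'doc1', 1],
]);
```

The same function can produce a separate test file, since test_data_path uses the same format as train_data_path.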
getTestDataPath
Cloud Storage test data. Same format as train_data_path. If not provided,
a random 80/20 train/test split will be performed on train_data_path.
Returns
Type
Description
string
setTestDataPath
Cloud Storage test data. Same format as train_data_path. If not provided,
a random 80/20 train/test split will be performed on train_data_path.
Parameter
Name
Description
var
string
Returns
Type
Description
$this
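To make the fallback behavior concrete, the following sketch shows what a random 80/20 split of training rows looks like when done locally. It is an illustration only, not the service's implementation; the function name splitTrainTest and the optional seed parameter are hypothetical.

```php
<?php
// Hypothetical helper: shuffles rows and splits them into train and test
// portions. The seed is optional and only makes demo runs repeatable.
function splitTrainTest(array $rows, float $trainFraction = 0.8, ?int $seed = null): array
{
    if ($seed !== null) {
        mt_srand($seed);
    }
    shuffle($rows);
    $cut = (int) round(count($rows) * $trainFraction);
    return [array_slice($rows, 0, $cut), array_slice($rows, $cut)];
}

[$train, $test] = splitTrainTest(range(1, 10), 0.8, 42);
echo count($train), ' train rows, ', count($test), ' test rows', PHP_EOL;
```

Providing an explicit test file through setTestDataPath avoids the random split and keeps evaluation data fixed across training runs.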
Last updated 2025-09-04 UTC.