Reference documentation and code samples for the Discovery Engine V1BETA API class Google::Cloud::DiscoveryEngine::V1beta::TrainCustomModelRequest::GcsTrainingInput.
Cloud Storage training data input.
Inherits
Object
Extended By
Google::Protobuf::MessageExts::ClassMethods
Includes
Google::Protobuf::MessageExts
Methods
#corpus_data_path
def corpus_data_path() -> ::String
Returns
(::String) — The Cloud Storage corpus data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file is newline-delimited JSON (JSONL/NDJSON).
For a search-tuning model, each line should have the _id, title,
and text fields. Example:
{"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
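As a rough illustration of the corpus file format described above, the following Ruby sketch serializes documents to JSONL using only the standard library. The helper name corpus_jsonl and the sample documents are hypothetical and are not part of the Discovery Engine client; uploading the result to Cloud Storage is left out.

```ruby
require "json"

# Serialize documents as one JSON object per line (JSONL/NDJSON).
# required_keys mirrors the fields the description lists for search tuning.
def corpus_jsonl(docs, required_keys = %w[_id title text])
  docs.map do |doc|
    missing = required_keys - doc.keys
    raise ArgumentError, "document is missing: #{missing.join(', ')}" unless missing.empty?

    JSON.generate(doc) + "\n"
  end.join
end

# Hypothetical sample data for illustration only.
docs = [
  { "_id" => "doc1", "title" => "relevant doc", "text" => "relevant text" },
  { "_id" => "doc2", "title" => "another doc", "text" => "more text" }
]
print corpus_jsonl(docs)
```

Generating each line with JSON.generate guarantees that every key is quoted, which hand-written JSONL can easily get wrong.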
#corpus_data_path=
def corpus_data_path=(value) -> ::String
Parameter
value (::String) — The Cloud Storage corpus data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file is newline-delimited JSON (JSONL/NDJSON).
For a search-tuning model, each line should have the _id, title,
and text fields. Example:
{"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
Returns
(::String) — The Cloud Storage corpus data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file is newline-delimited JSON (JSONL/NDJSON).
For a search-tuning model, each line should have the _id, title,
and text fields. Example:
{"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
#query_data_path
def query_data_path() -> ::String
Returns
(::String) — The Cloud Storage query data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file is newline-delimited JSON (JSONL/NDJSON).
For a search-tuning model, each line should have the _id
and text fields. Example: {"_id": "query1", "text": "example query"}
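A query file can be checked locally before it is uploaded. The sketch below is one possible approach, assuming the file contents are already in a string; query_jsonl_errors is a hypothetical helper, not part of the client library, and it only checks that each line parses as a JSON object with the _id and text keys.

```ruby
require "json"

# Return a list of problems found in JSONL text.
# An empty list means every non-blank line parsed and had the required keys.
def query_jsonl_errors(jsonl, required_keys = %w[_id text])
  errors = []
  jsonl.each_line.with_index(1) do |line, lineno|
    next if line.strip.empty?

    begin
      record = JSON.parse(line)
    rescue JSON::ParserError
      errors << "line #{lineno}: invalid JSON"
      next
    end

    unless record.is_a?(Hash)
      errors << "line #{lineno}: expected a JSON object"
      next
    end

    missing = required_keys.reject { |key| record.key?(key) }
    errors << "line #{lineno}: missing #{missing.join(', ')}" unless missing.empty?
  end
  errors
end

# Hypothetical sample input for illustration only.
sample = %({"_id": "query1", "text": "example query"}\n)
puts query_jsonl_errors(sample).empty? ? "ok" : "has errors"
```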
#query_data_path=
def query_data_path=(value) -> ::String
Parameter
value (::String) — The Cloud Storage query data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file is newline-delimited JSON (JSONL/NDJSON).
For a search-tuning model, each line should have the _id
and text fields. Example: {"_id": "query1", "text": "example query"}
Returns
(::String) — The Cloud Storage query data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file is newline-delimited JSON (JSONL/NDJSON).
For a search-tuning model, each line should have the _id
and text fields. Example: {"_id": "query1", "text": "example query"}
#test_data_path
def test_data_path() -> ::String
Returns
(::String) — Cloud Storage test data. Same format as train_data_path. If not provided,
a random 80/20 train/test split will be performed on train_data_path.
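To make the 80/20 default concrete, here is a minimal sketch of a random split done locally on the data rows of a training file. It only illustrates the idea; how the service performs its own split is not documented here, and train_test_split, its parameters, and the seed are assumptions made for this example.

```ruby
# Split rows into [train, test] using a seeded shuffle.
# The header line of a TSV file should be set aside before calling this.
def train_test_split(rows, test_fraction: 0.2, seed: 42)
  shuffled = rows.shuffle(random: Random.new(seed))
  test_size = (rows.size * test_fraction).round
  [shuffled.drop(test_size), shuffled.take(test_size)]
end

# Hypothetical data rows for illustration only.
rows = (1..10).map { |i| "query#{i}\tdoc#{i}\t1" }
train_rows, test_rows = train_test_split(rows)
puts "train=#{train_rows.size} test=#{test_rows.size}"
```

Supplying an explicit test file through test_data_path instead keeps the evaluation set fixed across training runs.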
#test_data_path=
def test_data_path=(value) -> ::String
Parameter
value (::String) — Cloud Storage test data. Same format as train_data_path. If not provided,
a random 80/20 train/test split will be performed on train_data_path.
Returns
(::String) — Cloud Storage test data. Same format as train_data_path. If not provided,
a random 80/20 train/test split will be performed on train_data_path.
#train_data_path
def train_data_path() -> ::String
Returns
(::String) — Cloud Storage training data path, in the format
gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV
format. Each line should have the doc_id, query_id, and score (number).
For a search-tuning model, the file should have query-id, corpus-id, and
score as the TSV header. The score should be a number in [0, +inf).
The larger the number, the more relevant the pair. Example:
query-id\tcorpus-id\tscore
query1\tdoc1\t1
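The TSV rules above (a three-column header, then rows with a non-negative numeric score) can be checked before upload. The following is a sketch under those stated rules only; train_tsv_errors is a hypothetical helper and does not reflect any validation the service itself performs.

```ruby
# Return a list of problems found in training TSV text.
# The default header matches the one described for search tuning.
def train_tsv_errors(tsv, header: %w[query-id corpus-id score])
  lines = tsv.lines(chomp: true)
  return ["file is empty"] if lines.empty?

  errors = []
  unless lines.first.split("\t", -1) == header
    errors << "line 1: expected header #{header.join(' ')}"
  end

  lines.drop(1).each.with_index(2) do |line, lineno|
    fields = line.split("\t", -1)
    if fields.size != 3
      errors << "line #{lineno}: expected 3 tab-separated fields, got #{fields.size}"
      next
    end

    # Float(..., exception: false) returns nil for non-numeric text.
    score = Float(fields[2], exception: false)
    unless score && score.finite? && score >= 0
      errors << "line #{lineno}: score must be a number in [0, +inf)"
    end
  end
  errors
end

# Hypothetical sample input; "\t" in a double-quoted Ruby string is a real tab.
sample = "query-id\tcorpus-id\tscore\nquery1\tdoc1\t1\n"
puts train_tsv_errors(sample).empty? ? "ok" : "has errors"
```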
#train_data_path=
def train_data_path=(value) -> ::String
Parameter
value (::String) — Cloud Storage training data path, in the format
gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV
format. Each line should have the doc_id, query_id, and score (number).
For a search-tuning model, the file should have query-id, corpus-id, and
score as the TSV header. The score should be a number in [0, +inf).
The larger the number, the more relevant the pair. Example:
query-id\tcorpus-id\tscore
query1\tdoc1\t1
Returns
(::String) — Cloud Storage training data path, in the format
gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV
format. Each line should have the doc_id, query_id, and score (number).
For a search-tuning model, the file should have query-id, corpus-id, and
score as the TSV header. The score should be a number in [0, +inf).
The larger the number, the more relevant the pair. Example:
query-id\tcorpus-id\tscore
query1\tdoc1\t1