Reference documentation and code samples for the Discovery Engine V1 API class Google::Cloud::DiscoveryEngine::V1::TrainCustomModelRequest::GcsTrainingInput.
Cloud Storage training data input.
Inherits
- Object
Extended By
- Google::Protobuf::MessageExts::ClassMethods
Includes
- Google::Protobuf::MessageExts
Methods
#corpus_data_path
def corpus_data_path() -> ::String
- (::String) — The Cloud Storage path to the corpus data that can be associated with the training data. The path format is gs://<bucket_to_data>/<jsonl_file_name>. The file should be newline-delimited JSON (jsonl/ndjson). For a search-tuning model, each line should contain the _id, title, and text fields. Example:
{"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
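As an illustration of the corpus file format described above, the following minimal sketch serializes records as one JSON object per line using only Ruby's standard library. The helper name to_jsonl and the sample documents are hypothetical; only the field names _id, title, and text come from the description.

```ruby
require "json"

# Serializes an array of hashes as newline-delimited JSON (jsonl/ndjson):
# one JSON object per line, each line terminated by "\n".
# The helper name is an assumption made for this sketch.
def to_jsonl(records)
  records.map { |record| JSON.generate(record) }.join("\n") + "\n"
end

# Hypothetical corpus documents; the values are made up for illustration.
corpus_docs = [
  { "_id" => "doc1", "title" => "relevant doc", "text" => "relevant text" },
  { "_id" => "doc2", "title" => "another doc", "text" => "more text" }
]

puts to_jsonl(corpus_docs)
```

The resulting text would then be uploaded to the Cloud Storage object that corpus_data_path points to; the upload step is outside the scope of this sketch.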
#corpus_data_path=
def corpus_data_path=(value) -> ::String
- value (::String) — The Cloud Storage path to the corpus data that can be associated with the training data. The path format is gs://<bucket_to_data>/<jsonl_file_name>. The file should be newline-delimited JSON (jsonl/ndjson). For a search-tuning model, each line should contain the _id, title, and text fields. Example:
{"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
- (::String) — The Cloud Storage path to the corpus data that can be associated with the training data. The path format is gs://<bucket_to_data>/<jsonl_file_name>. The file should be newline-delimited JSON (jsonl/ndjson). For a search-tuning model, each line should contain the _id, title, and text fields. Example:
{"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
#query_data_path
def query_data_path() -> ::String
- (::String) — The Cloud Storage path to the query data that can be associated with the training data. The path format is gs://<bucket_to_data>/<jsonl_file_name>. The file should be newline-delimited JSON (jsonl/ndjson). For a search-tuning model, each line should contain the _id and text fields. Example:
{"_id": "query1", "text": "example query"}
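Before uploading a query file, it can help to check locally that every line carries the fields named above. The sketch below is one possible check, not the validation the service itself performs; the helper name invalid_query_lines and its required_keys parameter are assumptions made for illustration.

```ruby
require "json"

# Returns the 1-based numbers of lines that are not JSON objects or that
# lack one of the required keys. Blank lines are skipped.
# This is a local sanity check only (hypothetical helper).
def invalid_query_lines(jsonl, required_keys: %w[_id text])
  jsonl.each_line.with_index(1).filter_map do |line, number|
    next if line.strip.empty?

    begin
      record = JSON.parse(line)
      valid = record.is_a?(Hash) && required_keys.all? { |key| record.key?(key) }
      number unless valid
    rescue JSON::ParserError
      number
    end
  end
end

sample = %({"_id": "query1", "text": "example query"}\n)
puts invalid_query_lines(sample).inspect
```

The same helper could be reused for the corpus file by passing required_keys: %w[_id title text].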
#query_data_path=
def query_data_path=(value) -> ::String
- value (::String) — The Cloud Storage path to the query data that can be associated with the training data. The path format is gs://<bucket_to_data>/<jsonl_file_name>. The file should be newline-delimited JSON (jsonl/ndjson). For a search-tuning model, each line should contain the _id and text fields. Example:
{"_id": "query1", "text": "example query"}
- (::String) — The Cloud Storage path to the query data that can be associated with the training data. The path format is gs://<bucket_to_data>/<jsonl_file_name>. The file should be newline-delimited JSON (jsonl/ndjson). For a search-tuning model, each line should contain the _id and text fields. Example:
{"_id": "query1", "text": "example query"}
#test_data_path
def test_data_path() -> ::String
- (::String) — Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path.
#test_data_path=
def test_data_path=(value) -> ::String
- value (::String) — Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path.
- (::String) — Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path.
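If you prefer to supply your own test file instead of relying on the automatic split, one option is to split the training TSV locally. The sketch below is only an illustration of an 80/20 split: the helper name split_train_test, the fixed seed, and the rounding rule are assumptions, and it does not claim to reproduce how the service splits the data.

```ruby
# Splits TSV text into [train_tsv, test_tsv]. The first line is treated as
# the header and repeated in both parts. Rows are shuffled with a seeded
# generator so the split is reproducible. Hypothetical helper.
def split_train_test(tsv, train_fraction: 0.8, seed: 42)
  header, *rows = tsv.lines.map(&:chomp).reject(&:empty?)
  shuffled = rows.shuffle(random: Random.new(seed))
  cut = (shuffled.size * train_fraction).round

  [shuffled[0...cut], shuffled[cut..]].map do |part|
    ([header] + part).join("\n") + "\n"
  end
end

# Made-up sample data with ten labeled pairs.
sample = "query-id\tcorpus-id\tscore\n" +
         (1..10).map { |i| "query#{i}\tdoc#{i}\t1\n" }.join

train_tsv, test_tsv = split_train_test(sample)
puts "train rows: #{train_tsv.lines.size - 1}, test rows: #{test_tsv.lines.size - 1}"
```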
#train_data_path
def train_data_path() -> ::String
- (::String) — Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should contain the doc_id, the query_id, and a score (number). For a search-tuning model, the file should have query-id corpus-id score as its TSV header. The score should be a number in [0, inf+). The larger the number, the more relevant the pair. Example:
query-id\tcorpus-id\tscore
query1\tdoc1\t1
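The TSV layout described above can be produced with plain string joins. The following is a minimal sketch, assuming labeled pairs are available as [query_id, corpus_id, score] triples; the helper name build_train_tsv is hypothetical, and the score check simply mirrors the stated range of non-negative numbers.

```ruby
# Builds TSV text with the header "query-id<TAB>corpus-id<TAB>score" followed
# by one row per labeled pair. Raises ArgumentError for scores that are not
# finite, non-negative numbers. Hypothetical helper.
def build_train_tsv(pairs)
  rows = [%w[query-id corpus-id score]]
  pairs.each do |query_id, corpus_id, score|
    numeric = score.is_a?(Integer) || score.is_a?(Float)
    unless numeric && score.finite? && score >= 0
      raise ArgumentError, "score must be a finite, non-negative number"
    end
    rows << [query_id, corpus_id, score]
  end
  rows.map { |row| row.join("\t") }.join("\n") + "\n"
end

# Made-up sample pairs: a higher score marks a more relevant pair.
puts build_train_tsv([["query1", "doc1", 1], ["query1", "doc2", 0]])
```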
#train_data_path=
def train_data_path=(value) -> ::String
- value (::String) — Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should contain the doc_id, the query_id, and a score (number). For a search-tuning model, the file should have query-id corpus-id score as its TSV header. The score should be a number in [0, inf+). The larger the number, the more relevant the pair. Example:
query-id\tcorpus-id\tscore
query1\tdoc1\t1
- (::String) — Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should contain the doc_id, the query_id, and a score (number). For a search-tuning model, the file should have query-id corpus-id score as its TSV header. The score should be a number in [0, inf+). The larger the number, the more relevant the pair. Example:
query-id\tcorpus-id\tscore
query1\tdoc1\t1
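Putting the four fields together, the sketch below assembles the attribute values and, if the client gem is available, passes them to the message constructor. The bucket name, file names, and the helpers gcs_path? and training_input_fields are hypothetical. The path check is a simple local pattern match, not the service's own validation, and the require is guarded so the snippet still runs without the gem installed.

```ruby
# Returns true when the value looks like gs://<bucket>/<object>.
# Local convenience check only (hypothetical helper).
def gcs_path?(value)
  value.is_a?(String) && value.match?(%r{\Ags://[^/]+/.+\z})
end

# Builds the field hash for GcsTrainingInput. File names are made up.
# test_data_path is left out here; per the description above, an 80/20
# split of train_data_path is then performed.
def training_input_fields(bucket)
  {
    corpus_data_path: "gs://#{bucket}/corpus.jsonl",
    query_data_path: "gs://#{bucket}/queries.jsonl",
    train_data_path: "gs://#{bucket}/train.tsv"
  }
end

fields = training_input_fields("my-bucket")
raise ArgumentError, "invalid Cloud Storage path" unless fields.values.all? { |path| gcs_path?(path) }

begin
  require "google/cloud/discovery_engine/v1"
  input = Google::Cloud::DiscoveryEngine::V1::TrainCustomModelRequest::GcsTrainingInput.new(fields)
  puts input.train_data_path
rescue LoadError
  # The gem is optional for this sketch; print the fields instead.
  puts "google-cloud-discovery_engine-v1 not installed; fields: #{fields}"
end
```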