This preview documentation is deprecated as of October 27, 2023. For GA documentation, go to the Vertex AI Search documentation.
Changes in GA:
Name: Discovery for Media is renamed to Vertex AI Search for media. Vertex AI Search includes media recommendations and media search.
Google Cloud Console page: Use the Agent Builder page in the console. The Discovery Engine console page is deprecated.
API reference: Continue to use the discoveryengine.googleapis.com service. The API remains the same but the documentation has moved. Go to the up-to-date, GA version of the Discovery Engine API reference in the Vertex AI Search documentation.
inputUris[]
string
Required. Cloud Storage URIs of the input files. Each URI can be up to 2000 characters long. A URI can match the full object path (for example, gs://bucket/directory/object.json) or be a pattern matching one or more files, such as gs://bucket/directory/*.json.
A request can contain at most 100 files (or 100,000 files if dataSchema is content). Each file can be up to 2 GB (or 100 MB if dataSchema is content).
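The limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper names `build_gcs_source` and `max_files_per_request` are hypothetical, and the `gs://` prefix check is an assumption about what counts as a Cloud Storage URI. Only the `inputUris` and `dataSchema` field names and the numeric limits come from this page.

```python
import json

MAX_URI_LENGTH = 2000  # per-URI character limit stated above


def max_files_per_request(data_schema: str) -> int:
    """Return the file-count limit stated above for a given dataSchema."""
    return 100_000 if data_schema == "content" else 100


def build_gcs_source(input_uris, data_schema="document"):
    """Build a dict with the inputUris and dataSchema fields (hypothetical helper).

    Only checks what can be verified locally: the list is non-empty and each
    URI looks like a Cloud Storage URI within the length limit. It cannot
    count how many files a wildcard pattern matches.
    """
    if not input_uris:
        raise ValueError("inputUris is required and must not be empty")
    for uri in input_uris:
        if not uri.startswith("gs://"):  # assumption: URIs use the gs:// scheme
            raise ValueError(f"not a Cloud Storage URI: {uri!r}")
        if len(uri) > MAX_URI_LENGTH:
            raise ValueError(f"URI is longer than {MAX_URI_LENGTH} characters")
    return {"inputUris": list(input_uris), "dataSchema": data_schema}


source = build_gcs_source(["gs://bucket/directory/*.json"])
print(json.dumps(source, indent=2))
```

The file-count and file-size limits apply to the files that the URIs match, so they would still need to be verified against the bucket contents or left to the service to enforce.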
dataSchema
string
The schema to use when parsing the data from the source.
Supported values for document imports:
document (default): One JSON Document per line. Each document must have a valid Document.id.
content: Unstructured data (e.g. PDF, HTML). Each file matched by inputUris becomes a document, with the ID set to the first 128 bits of SHA256(URI) encoded as a hex string.
custom: One custom data JSON per row in arbitrary format that conforms to the defined Schema of the data store. This can only be used by Gen App Builder.
csv: A CSV file with header conforming to the defined Schema of the data store. Each entry after the header is imported as a Document. This can only be used by Gen App Builder.
Supported values for user event imports:
user_event (default): One JSON UserEvent per line.
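For the content schema described above, the document ID derivation can be approximated locally, for example to predict the ID a file will receive. This is a sketch under stated assumptions: the function name `content_document_id` is hypothetical, and the page does not say how the URI is encoded before hashing or whether the hex string is lowercase. UTF-8 and lowercase hex are assumed here.

```python
import hashlib


def content_document_id(uri: str) -> str:
    """Return the first 128 bits of SHA256(uri) as a hex string.

    Assumptions (not confirmed by the reference): the URI is hashed as
    UTF-8 bytes and the result is lowercase hex.
    """
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    return digest[:16].hex()  # 16 bytes = 128 bits = 32 hex characters


doc_id = content_document_id("gs://bucket/directory/object.pdf")
print(len(doc_id), doc_id)
```

Because the ID depends only on the URI, importing the same object path again would map to the same document ID under these assumptions.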
Last updated 2024-02-13 UTC.