# Method: ragFiles.import

**Full name**: `projects.locations.ragCorpora.ragFiles.import`

Import files from Google Cloud Storage or Google Drive into a RagCorpus.

### Endpoint

`post https://aiplatform.googleapis.com/v1/{parent}/ragFiles:import`

### Path parameters

`parent` `string`
Required. The name of the RagCorpus resource into which to import files. Format: `projects/{project}/locations/{location}/ragCorpora/{ragCorpus}`

### Request body

The request body contains data with the following structure:

Fields
`importRagFilesConfig` `object (ImportRagFilesConfig)`
Required. The config for the RagFiles to be synced and imported into the RagCorpus.

### Response body

If successful, the response body contains an instance of `Operation`.

ImportRagFilesConfig
--------------------

Config for importing RagFiles.

Fields
`ragFileTransformationConfig` `object (RagFileTransformationConfig)`
Specifies the transformation config for RagFiles.
`ragFileParsingConfig` `object (RagFileParsingConfig)`
Optional. Specifies the parsing config for RagFiles. RAG will use the default parser if this field is not set.
`maxEmbeddingRequestsPerMin` `integer`
Optional. The maximum number of queries per minute that this job is allowed to make to the embedding model specified on the corpus. This value is specific to this job and not shared across other import jobs. Consult the Quotas page for your project to set an appropriate value here. If unspecified, a default value of 1,000 QPM is used.
`rebuildAnnIndex` `boolean`
Rebuilds the ANN index to optimize for recall on the imported data. Only applicable for RagCorpora running on RagManagedDb with `retrieval_strategy` set to `ANN`. The rebuild is performed using the existing ANN config set on the RagCorpus. To change the ANN config, use the UpdateRagCorpus API.
Default is `false`, i.e., the index is not rebuilt.
`import_source` `Union type`
The source of the import. `import_source` can be only one of the following:
`gcsSource` `object (GcsSource)`
Google Cloud Storage location. Supports importing individual files as well as entire Google Cloud Storage directories. Sample formats:
- `gs://bucketName/my_directory/objectName/my_file.txt`
- `gs://bucketName/my_directory`
`googleDriveSource` `object (GoogleDriveSource)`
Google Drive location. Supports importing individual files as well as Google Drive folders.
`slackSource` `object (SlackSource)`
Slack channels with their corresponding access tokens.
`jiraSource` `object (JiraSource)`
Jira queries with their corresponding authentication.
`sharePointSources` `object (SharePointSources)`
SharePoint sources.
`partial_failure_sink` `Union type`
Optional. If provided, all partial failures are written to the sink. Deprecated. Prefer to use `import_result_sink`. `partial_failure_sink` can be only one of the following:
`partialFailureGcsSink` **(deprecated)** `object (GcsDestination)`
The Cloud Storage path to write partial failures to. Deprecated. Prefer to use `importResultGcsSink`.
`partialFailureBigquerySink` **(deprecated)** `object (BigQueryDestination)`
The BigQuery destination to write partial failures to. It should be a BigQuery table resource name (e.g. `bq://projectId.bqDatasetId.bqTableId`). The dataset must exist. If the table does not exist, it will be created with the expected schema. If the table exists, the schema will be validated and data will be added to this existing table. Deprecated. Prefer to use `import_result_bq_sink`.
`import_result_sink` `Union type`
Optional. If provided, all successfully imported files and all partial failures are written to the sink. `import_result_sink` can be only one of the following:
`importResultGcsSink` `object (GcsDestination)`
The Cloud Storage path to write the import result to.
`importResultBigquerySink` `object (BigQueryDestination)`
The BigQuery destination to write the import result to. It should be a BigQuery table resource name (e.g. `bq://projectId.bqDatasetId.bqTableId`). The dataset must exist. If the table does not exist, it will be created with the expected schema. If the table exists, the schema will be validated and data will be added to this existing table.
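Putting these fields together, a minimal request body for `ragFiles:import` might look like the following sketch. The bucket, project, dataset, and table names are hypothetical placeholders, and the `uris` field is assumed from the linked `GcsSource` type rather than defined on this page:

{
  "importRagFilesConfig": {
    // Import everything under a hypothetical Cloud Storage directory
    // ("uris" is the assumed GcsSource shape).
    "gcsSource": {
      "uris": ["gs://my-bucket/my_directory"]
    },
    // Stay at the documented default embedding throughput.
    "maxEmbeddingRequestsPerMin": 1000,
    // Record successes and partial failures in a hypothetical BigQuery table.
    "importResultBigquerySink": {
      "outputUri": "bq://my-project.rag_dataset.import_results"
    }
  }
}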
BigQueryDestination
-------------------

The BigQuery location for the output content.

Fields
`outputUri` `string`
Required. BigQuery URI to a project or table, up to 2000 characters long.
When only the project is specified, the dataset and table are created. When the full table reference is specified, the dataset must exist and the table must not exist.

Accepted forms:
- BigQuery path. For example: `bq://projectId`, `bq://projectId.bqDatasetId`, or `bq://projectId.bqDatasetId.bqTableId`.
JSON representation

{
  "outputUri": string
}
RagFileParsingConfig
--------------------

Specifies the parsing config for RagFiles.

Fields
`parser` `Union type`
The parser to use for RagFiles. `parser` can be only one of the following:
`layoutParser` `object (LayoutParser)`
The Layout Parser to use for RagFiles.
`llmParser` `object (LlmParser)`
The LLM Parser to use for RagFiles.

JSON representation

{
  // parser: set exactly one of the following (Union type).
  "layoutParser": { object (LayoutParser) },
  "llmParser": { object (LlmParser) }
}
LayoutParser
------------

Document AI Layout Parser config.

Fields
`processorName` `string`
The full resource name of a Document AI processor or processor version. The processor must have type `LAYOUT_PARSER_PROCESSOR`. If specified, the `additionalConfig.parse_as_scanned_pdf` field must be false. Format:
- `projects/{projectId}/locations/{location}/processors/{processorId}`
- `projects/{projectId}/locations/{location}/processors/{processorId}/processorVersions/{processor_version_id}`
`maxParsingRequestsPerMin` `integer`
The maximum number of requests the job is allowed to make to the Document AI processor per minute. Consult https://cloud.google.com/document-ai/quotas and the Quotas page for your project to set an appropriate value here. If unspecified, a default value of 120 QPM is used.
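As a sketch, a parsing config that selects the Layout Parser might look like the following; the project and processor IDs are hypothetical placeholders:

{
  "layoutParser": {
    // Hypothetical processor; it must have type LAYOUT_PARSER_PROCESSOR.
    "processorName": "projects/my-project/locations/us/processors/my-processor-id",
    // Matches the documented default of 120 QPM.
    "maxParsingRequestsPerMin": 120
  }
}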
LlmParser
---------

Specifies the LLM parsing for RagFiles.

Fields
`modelName` `string`
The name of an LLM model used for parsing. Format:
- `projects/{projectId}/locations/{location}/publishers/{publisher}/models/{model}`
`maxParsingRequestsPerMin` `integer`
The maximum number of requests the job is allowed to make to the LLM model per minute. Consult https://cloud.google.com/vertex-ai/generative-ai/docs/quotas and your document size to set an appropriate value here. If unspecified, a default value of 5,000 QPM is used.
`customParsingPrompt` `string`
The prompt to use for parsing. If not specified, a default prompt is used.
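Similarly, a parsing config that selects the LLM parser might look like the following sketch; the project ID, model resource name, and prompt are hypothetical placeholders:

{
  "llmParser": {
    // Hypothetical publisher model resource name.
    "modelName": "projects/my-project/locations/us-central1/publishers/google/models/gemini-2.0-flash-001",
    "maxParsingRequestsPerMin": 1000,
    // Optional; omit to use the default parsing prompt.
    "customParsingPrompt": "Extract the document's body text and tables as plain text."
  }
}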
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-06-27 UTC."],[],[],null,["# Method: ragFiles.import\n\n**Full name**: projects.locations.ragCorpora.ragFiles.import\n\nImport files from Google Cloud Storage or Google Drive into a RagCorpus. \n\n### Endpoint\n\npost `https:``/``/aiplatform.googleapis.com``/v1``/{parent}``/ragFiles:import` \n\n### Path parameters\n\n`parent` `string` \nRequired. The name of the RagCorpus resource into which to import files. Format: `projects/{project}/locations/{location}/ragCorpora/{ragCorpus}`\n\n### Request body\n\nThe request body contains data with the following structure:\nFields `importRagFilesConfig` `object (`[ImportRagFilesConfig](/vertex-ai/generative-ai/docs/reference/rest/v1/projects.locations.ragCorpora.ragFiles/import#ImportRagFilesConfig)`)` \nRequired. The config for the RagFiles to be synced and imported into the RagCorpus. [VertexRagDataService.ImportRagFiles](/vertex-ai/generative-ai/docs/reference/rest/v1/projects.locations.ragCorpora.ragFiles/import#google.cloud.aiplatform.v1.VertexRagDataService.ImportRagFiles). \n\n### Response body\n\nIf successful, the response body contains an instance of [Operation](/vertex-ai/generative-ai/docs/reference/rest/Shared.Types/ListOperationsResponse#Operation).\n\nImportRagFilesConfig\n--------------------\n\nConfig for importing RagFiles.\nFields `ragFileTransformationConfig` `object (`[RagFileTransformationConfig](/vertex-ai/generative-ai/docs/reference/rest/v1/RagFileTransformationConfig)`)` \nSpecifies the transformation config for RagFiles.\n`ragFileParsingConfig` `object (`[RagFileParsingConfig](/vertex-ai/generative-ai/docs/reference/rest/v1/projects.locations.ragCorpora.ragFiles/import#RagFileParsingConfig)`)` \nOptional. Specifies the parsing config for RagFiles. RAG will use the default parser if this field is not set.\n`maxEmbeddingRequestsPerMin` `integer` \nOptional. The max number of queries per minute that this job is allowed to make to the embedding model specified on the corpus. This value is specific to this job and not shared across other import jobs. Consult the Quotas page on the project to set an appropriate value here. If unspecified, a default value of 1,000 QPM would be used.\n`rebuildAnnIndex` `boolean` \nRebuilds the ANN index to optimize for recall on the imported data. Only applicable for RagCorpora running on RagManagedDb with `retrieval_strategy` set to `ANN`. The rebuild will be performed using the existing ANN config set on the RagCorpus. To change the ANN config, please use the UpdateRagCorpus API.\n\nDefault is false, i.e., index is not rebuilt. \n`import_source` `Union type` \nThe source of the import. `import_source` can be only one of the following:\n`gcsSource` `object (`[GcsSource](/vertex-ai/generative-ai/docs/reference/rest/v1/projects.locations.ragCorpora.ragFiles#GcsSource)`)` \nGoogle Cloud Storage location. Supports importing individual files as well as entire Google Cloud Storage directories. 
Sample formats: - `gs://bucketName/my_directory/objectName/my_file.txt` - `gs://bucketName/my_directory`\n`googleDriveSource` `object (`[GoogleDriveSource](/vertex-ai/generative-ai/docs/reference/rest/v1/projects.locations.ragCorpora.ragFiles#GoogleDriveSource)`)` \nGoogle Drive location. Supports importing individual files as well as Google Drive folders.\n`slackSource` `object (`[SlackSource](/vertex-ai/generative-ai/docs/reference/rest/v1/projects.locations.ragCorpora.ragFiles#SlackSource)`)` \nSlack channels with their corresponding access tokens.\n`jiraSource` `object (`[JiraSource](/vertex-ai/generative-ai/docs/reference/rest/v1/projects.locations.ragCorpora.ragFiles#JiraSource)`)` \nJira queries with their corresponding authentication.\n`sharePointSources` `object (`[SharePointSources](/vertex-ai/generative-ai/docs/reference/rest/v1/projects.locations.ragCorpora.ragFiles#SharePointSources)`)` \nSharePoint sources. \n`partial_failure_sink` `Union type` \nOptional. If provided, all partial failures are written to the sink. Deprecated. Prefer to use the `import_result_sink`. `partial_failure_sink` can be only one of the following:\n`partialFailureGcsSink` \n**(deprecated)** `object (`[GcsDestination](/vertex-ai/generative-ai/docs/reference/rest/v1/GcsDestination)`)` \n| This item is deprecated!\n\nThe Cloud Storage path to write partial failures to. Deprecated. Prefer to use `importResultGcsSink`.\n`partialFailureBigquerySink` \n**(deprecated)** `object (`[BigQueryDestination](/vertex-ai/generative-ai/docs/reference/rest/v1/projects.locations.ragCorpora.ragFiles/import#BigQueryDestination)`)` \n| This item is deprecated!\n\nThe BigQuery destination to write partial failures to. It should be a bigquery table resource name (e.g. \"bq://projectId.bqDatasetId.bqTableId\"). The dataset must exist. If the table does not exist, it will be created with the expected schema. If the table exists, the schema will be validated and data will be added to this existing table. Deprecated. Prefer to use `import_result_bq_sink`. \n`import_result_sink` `Union type` \nOptional. If provided, all successfully imported files and all partial failures are written to the sink. `import_result_sink` can be only one of the following:\n`importResultGcsSink` `object (`[GcsDestination](/vertex-ai/generative-ai/docs/reference/rest/v1/GcsDestination)`)` \nThe Cloud Storage path to write import result to.\n`importResultBigquerySink` `object (`[BigQueryDestination](/vertex-ai/generative-ai/docs/reference/rest/v1/projects.locations.ragCorpora.ragFiles/import#BigQueryDestination)`)` \nThe BigQuery destination to write import result to. It should be a bigquery table resource name (e.g. \"bq://projectId.bqDatasetId.bqTableId\"). The dataset must exist. If the table does not exist, it will be created with the expected schema. If the table exists, the schema will be validated and data will be added to this existing table. \n\nBigQueryDestination\n-------------------\n\nThe BigQuery location for the output content.\nFields `outputUri` `string` \nRequired. BigQuery URI to a project or table, up to 2000 characters long.\n\nWhen only the project is specified, the Dataset and Table is created. When the full table reference is specified, the Dataset must exist and table must not exist.\n\nAccepted forms:\n\n- BigQuery path. For example: `bq://projectId` or `bq://projectId.bqDatasetId` or `bq://projectId.bqDatasetId.bqTableId`. 
\n\nRagFileParsingConfig\n--------------------\n\nSpecifies the parsing config for RagFiles.\nFields \n`parser` `Union type` \nThe parser to use for RagFiles. `parser` can be only one of the following:\n`layoutParser` `object (`[LayoutParser](/vertex-ai/generative-ai/docs/reference/rest/v1/projects.locations.ragCorpora.ragFiles/import#LayoutParser)`)` \nThe Layout Parser to use for RagFiles.\n`llmParser` `object (`[LlmParser](/vertex-ai/generative-ai/docs/reference/rest/v1/projects.locations.ragCorpora.ragFiles/import#LlmParser)`)` \nThe LLM Parser to use for RagFiles. \n\nLayoutParser\n------------\n\nDocument AI Layout Parser config.\nFields `processorName` `string` \nThe full resource name of a Document AI processor or processor version. The processor must have type `LAYOUT_PARSER_PROCESSOR`. If specified, the `additionalConfig.parse_as_scanned_pdf` field must be false. Format: \\* `projects/{projectId}/locations/{location}/processors/{processorId}` \\* `projects/{projectId}/locations/{location}/processors/{processorId}/processorVersions/{processor_version_id}`\n`maxParsingRequestsPerMin` `integer` \nThe maximum number of requests the job is allowed to make to the Document AI processor per minute. Consult \u003chttps://cloud.google.com/document-ai/quotas\u003e and the Quota page for your project to set an appropriate value here. If unspecified, a default value of 120 QPM would be used. \n\nLlmParser\n---------\n\nSpecifies the LLM parsing for RagFiles.\nFields `modelName` `string` \nThe name of a LLM model used for parsing. Format: \\* `projects/{projectId}/locations/{location}/publishers/{publisher}/models/{model}`\n`maxParsingRequestsPerMin` `integer` \nThe maximum number of requests the job is allowed to make to the LLM model per minute. Consult \u003chttps://cloud.google.com/vertex-ai/generative-ai/docs/quotas\u003e and your document size to set an appropriate value here. If unspecified, a default value of 5000 QPM would be used.\n`customParsingPrompt` `string` \nThe prompt to use for parsing. If not specified, a default prompt will be used."]]