RagFileParsingConfig

Specifies the parsing config for RagFiles.

Fields
useAdvancedPdfParsing (deprecated) boolean

Whether to use advanced PDF parsing.

parser Union type
The parser to use for RagFiles. parser can be only one of the following:
advancedParser object (AdvancedParser)

The Advanced Parser to use for RagFiles.

layoutParser object (LayoutParser)

The Layout Parser to use for RagFiles.

llmParser object (LlmParser)

The LLM Parser to use for RagFiles.

JSON representation
{
  "useAdvancedPdfParsing": boolean,

  // parser
  "advancedParser": {
    object (AdvancedParser)
  },
  "layoutParser": {
    object (LayoutParser)
  },
  "llmParser": {
    object (LlmParser)
  }
  // Union type
}
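Because parser is a union, a request body should carry at most one of advancedParser, layoutParser, or llmParser. The following is a minimal Python sketch of a client-side check for that rule. The helper name build_parsing_config is hypothetical and not part of any SDK; it only assembles a dict in the shape of the JSON representation above.

```python
import json
from typing import Any, Dict, Optional


def build_parsing_config(
    advanced_parser: Optional[Dict[str, Any]] = None,
    layout_parser: Optional[Dict[str, Any]] = None,
    llm_parser: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return a RagFileParsingConfig-shaped dict with at most one parser set.

    Hypothetical helper: it mirrors the documented union rule and does not
    call any API.
    """
    candidates = {
        "advancedParser": advanced_parser,
        "layoutParser": layout_parser,
        "llmParser": llm_parser,
    }
    # Keep only the parser fields the caller actually supplied.
    chosen = {name: value for name, value in candidates.items() if value is not None}
    if len(chosen) > 1:
        raise ValueError(
            "parser is a union; set only one of: " + ", ".join(sorted(chosen))
        )
    return chosen


if __name__ == "__main__":
    config = build_parsing_config(advanced_parser={"useAdvancedPdfParsing": True})
    print(json.dumps(config, indent=2))
```

Rejecting the conflict before the request is sent gives a clearer error than relying on the service to report it.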

AdvancedParser

Specifies the advanced parsing for RagFiles.

Fields
useAdvancedPdfParsing boolean

Whether to use advanced PDF parsing.

JSON representation
{
  "useAdvancedPdfParsing": boolean
}

LayoutParser

Document AI Layout Parser config.

Fields
processorName string

The full resource name of a Document AI processor or processor version. The processor must have type LAYOUT_PARSER_PROCESSOR. If specified, the additionalConfig.parse_as_scanned_pdf field must be false. Format:
* projects/{projectId}/locations/{location}/processors/{processorId}
* projects/{projectId}/locations/{location}/processors/{processorId}/processorVersions/{processor_version_id}

maxParsingRequestsPerMin integer

The maximum number of requests the job is allowed to make to the Document AI processor per minute. Consult https://cloud.google.com/document-ai/quotas and the Quota page for your project to set an appropriate value here. If unspecified, a default value of 120 QPM is used.

globalMaxParsingRequestsPerMin integer

The maximum number of requests the job is allowed to make to the Document AI processor per minute in this project. Consult https://cloud.google.com/document-ai/quotas and the Quota page for your project to set an appropriate value here. If this value is not specified, maxParsingRequestsPerMin is used by the indexing pipeline as the global limit.

JSON representation
{
  "processorName": string,
  "maxParsingRequestsPerMin": integer,
  "globalMaxParsingRequestsPerMin": integer
}
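As an illustration of the LayoutParser fields, the Python sketch below builds a LayoutParser-shaped dict, checks processorName against the two documented formats, and resolves the effective rate limits. The function names and the regular expression are hypothetical. The pattern assumes each path segment is any non-empty string without a slash, which may be looser than what the service accepts. The fallback of the global limit to 120 when both limits are unset is also an assumption: the reference states only that the global limit falls back to maxParsingRequestsPerMin.

```python
import re
from typing import Any, Dict, Optional, Tuple

# Assumption: segments are non-empty and contain no "/"; the service may be stricter.
_PROCESSOR_NAME = re.compile(
    r"projects/[^/]+/locations/[^/]+/processors/[^/]+"
    r"(/processorVersions/[^/]+)?"
)

# Documented default for maxParsingRequestsPerMin when it is unspecified.
DEFAULT_REQUESTS_PER_MIN = 120


def build_layout_parser(
    processor_name: str,
    max_requests_per_min: Optional[int] = None,
    global_max_requests_per_min: Optional[int] = None,
) -> Dict[str, Any]:
    """Return a LayoutParser-shaped dict (hypothetical helper, no API call)."""
    if not _PROCESSOR_NAME.fullmatch(processor_name):
        raise ValueError(f"unexpected processor resource name: {processor_name!r}")
    config: Dict[str, Any] = {"processorName": processor_name}
    # Omit unset limits so the documented defaults apply.
    if max_requests_per_min is not None:
        config["maxParsingRequestsPerMin"] = max_requests_per_min
    if global_max_requests_per_min is not None:
        config["globalMaxParsingRequestsPerMin"] = global_max_requests_per_min
    return config


def effective_limits(config: Dict[str, Any]) -> Tuple[int, int]:
    """Return (per-job limit, global limit) after applying the fallbacks."""
    per_job = config.get("maxParsingRequestsPerMin", DEFAULT_REQUESTS_PER_MIN)
    # Assumption: when both are unset, the global limit chains to the 120 default.
    global_limit = config.get("globalMaxParsingRequestsPerMin", per_job)
    return per_job, global_limit


if __name__ == "__main__":
    # Placeholder resource name for illustration only.
    name = "projects/my-project/locations/us/processors/my-processor"
    parser = build_layout_parser(name, max_requests_per_min=60)
    print(parser)
    print(effective_limits(parser))
```

Leaving unset fields out of the dict, rather than sending explicit defaults, keeps the payload aligned with whatever defaults the service applies.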