GcsIngestPipeline(mapping=None, *, ignore_unknown_fields=False, **kwargs)
The configuration of the Cloud Storage Ingestion pipeline.
Attributes

| Name | Type | Description |
|---|---|---|
| input_path | str | The input Cloud Storage folder. All files under this folder will be imported to Document Warehouse. Format: gs:// . |
| schema_name | str | The Document Warehouse schema resource name. All documents processed by this pipeline will use this schema. Format: `projects/{project_number}/locations/{location}/documentSchemas/{document_schema_id}`. |
| processor_type | str | The Doc AI processor type name. Used only when the ingested files are in Doc AI Document proto format. |
| skip_ingested_documents | bool | Whether to skip documents that have already been ingested. If set to true, documents in Cloud Storage whose custom metadata contains the key "status" with the value "status=ingested" are skipped during ingestion. |
| pipeline_config | google.cloud.contentwarehouse_v1.types.IngestPipelineConfig | Optional. The config for the Cloud Storage Ingestion pipeline. It provides additional customization options to run the pipeline and can be omitted if it is not applicable. |
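
The attributes above can be assembled as a plain dictionary before they are handed to the message class. The following is a minimal sketch using only the standard library. The helper names `build_gcs_ingest_fields` and `would_be_skipped`, the bucket path, and the project and schema identifiers are hypothetical and exist only for illustration. The `would_be_skipped` check mirrors the documented custom metadata rule on the client side and is not the service's implementation.

```python
import re

# Pattern for the documented schema_name format. Treating each segment as
# "any characters except a slash" is an assumption made for this sketch.
SCHEMA_NAME_RE = re.compile(
    r"^projects/[^/]+/locations/[^/]+/documentSchemas/[^/]+$"
)


def build_gcs_ingest_fields(input_path, project_number, location,
                            document_schema_id,
                            skip_ingested_documents=False):
    """Hypothetical helper: build the field mapping for GcsIngestPipeline."""
    if not input_path.startswith("gs://"):
        raise ValueError("input_path must be a Cloud Storage path (gs://...)")
    schema_name = (
        f"projects/{project_number}/locations/{location}"
        f"/documentSchemas/{document_schema_id}"
    )
    if not SCHEMA_NAME_RE.match(schema_name):
        raise ValueError(f"malformed schema_name: {schema_name!r}")
    return {
        "input_path": input_path,
        "schema_name": schema_name,
        "skip_ingested_documents": skip_ingested_documents,
    }


def would_be_skipped(custom_metadata):
    """Hypothetical client-side check of the documented skip rule:
    key "status" with value "status=ingested" in custom metadata."""
    return custom_metadata.get("status") == "status=ingested"


# Placeholder values; substitute real identifiers.
fields = build_gcs_ingest_fields(
    "gs://example-bucket/invoices",
    project_number="123456789",
    location="us",
    document_schema_id="example-schema-id",
    skip_ingested_documents=True,
)
print(fields["schema_name"])

# With google-cloud-contentwarehouse installed, the dictionary can be passed
# as the `mapping` argument shown in the signature above:
#   from google.cloud import contentwarehouse_v1
#   pipeline = contentwarehouse_v1.GcsIngestPipeline(fields)
```

Building the mapping separately keeps the format checks testable without network access or credentials. `processor_type` and `pipeline_config` are left out here because they apply only in the cases described in the table.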