Cloud Document AI v1beta3 API - Class Document (2.0.0-beta27)

public sealed class Document : IMessage<Document>, IEquatable<Document>, IDeepCloneable<Document>, IBufferMessage, IMessage

Reference documentation and code samples for the Cloud Document AI v1beta3 API class Document.

Document represents the canonical document resource in Document AI. It is an interchange format that provides insights into documents and allows for collaboration between users and Document AI to iterate and optimize for quality.

Inheritance

object > Document

Namespace

Google.Cloud.DocumentAI.V1Beta3

Assembly

Google.Cloud.DocumentAI.V1Beta3.dll

Constructors

Document()

public Document()

Document(Document)

public Document(Document other)
Parameter
Name Description
other Document
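
The copy constructor carries no description on this page. Generated protobuf message classes in C# normally implement it as a deep copy, so the sketch below assumes that behavior; the sample values and the `CopyConstructorDemo` class name are illustrative only.

```csharp
using System;
using Google.Cloud.DocumentAI.V1Beta3;

public static class CopyConstructorDemo
{
    public static void Main()
    {
        var original = new Document { MimeType = "application/pdf", Text = "Invoice 42" };
        original.Pages.Add(new Document.Types.Page { PageNumber = 1 });

        // Assumption: the copy constructor deep-copies nested messages and
        // repeated fields, as generated protobuf code usually does.
        var copy = new Document(original);
        copy.Text = "edited";
        copy.Pages[0].PageNumber = 2;

        // If the copy is deep, the original keeps its own values.
        Console.WriteLine(original.Text);
        Console.WriteLine(original.Pages[0].PageNumber);
        Console.WriteLine(copy.Equals(original));
    }
}
```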

Properties

BlobAssets

public RepeatedField<Document.Types.BlobAsset> BlobAssets { get; }

Optional. The blob assets in this document. They store the content of inline blobs in this document, such as image bytes, so that other fields in the document can reference them by asset ID.

Property Value
Type Description
RepeatedField<Document.Types.BlobAsset>

ChunkedDocument

public Document.Types.ChunkedDocument ChunkedDocument { get; set; }

Document chunked based on chunking config.

Property Value
Type Description
Document.Types.ChunkedDocument

Content

public ByteString Content { get; set; }

Optional. Inline document content, represented as a stream of bytes. Note: As with all bytes fields, protocol buffers use a pure binary representation, whereas JSON representations use base64.

Property Value
Type Description
ByteString
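
The difference between the binary and JSON representations can be illustrated without the Document AI library. The following is a minimal sketch using only the .NET base class library; the sample bytes and the `ContentEncodingDemo` class name are assumptions made for illustration.

```csharp
using System;
using System.Text;

public static class ContentEncodingDemo
{
    // Encodes raw bytes as base64 text, the form a JSON representation
    // of a bytes field carries.
    public static string ToJsonBytes(byte[] raw) => Convert.ToBase64String(raw);

    // Decodes base64 text back to the raw bytes.
    public static byte[] FromJsonBytes(string base64) => Convert.FromBase64String(base64);

    public static void Main()
    {
        byte[] raw = Encoding.UTF8.GetBytes("%PDF-1.7");
        string json = ToJsonBytes(raw);

        // Base64 output is longer than the raw input (4 characters per 3 bytes).
        Console.WriteLine($"{raw.Length} raw bytes, {json.Length} base64 characters");

        // Decoding restores the original bytes exactly.
        byte[] roundTrip = FromJsonBytes(json);
        Console.WriteLine(Encoding.UTF8.GetString(roundTrip));
    }
}
```

When the raw bytes are assigned to Content, they would typically be wrapped first, for example with `ByteString.CopyFrom(raw)` from Google.Protobuf.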

Docid

public string Docid { get; set; }

Optional. An internal identifier for the document. Should be loggable (no PII).

Property Value
Type Description
string

DocumentLayout

public Document.Types.DocumentLayout DocumentLayout { get; set; }

Parsed layout of the document.

Property Value
Type Description
Document.Types.DocumentLayout

Entities

public RepeatedField<Document.Types.Entity> Entities { get; }

A list of entities detected on Document.text. For document shards, entities in this list may cross shard boundaries.

Property Value
Type Description
RepeatedField<Document.Types.Entity>

EntitiesRevisionId

public string EntitiesRevisionId { get; set; }

The ID of the entity revision that the document.entities field is based on. If this field is set and entities_revisions is not empty, document.entities holds the entities from the entity revision with this ID, and document.entity_validation_output holds that revision's entity_validation_output.

Property Value
Type Description
string

EntitiesRevisions

public RepeatedField<Document.Types.EntitiesRevision> EntitiesRevisions { get; }

A list of entity revisions. The entity revisions are appended to the document in processing order. This field can be used to compare the entity extraction results at different stages of processing.

Property Value
Type Description
RepeatedField<Document.Types.EntitiesRevision>

EntityRelations

public RepeatedField<Document.Types.EntityRelation> EntityRelations { get; }

Placeholder. Relationship among Document.entities.

Property Value
Type Description
RepeatedField<Document.Types.EntityRelation>

EntityValidationOutput

public Document.Types.EntityValidationOutput EntityValidationOutput { get; set; }

The entity validation output for the document. This is the validation output for the document.entities field.

Property Value
Type Description
Document.Types.EntityValidationOutput

Error

public Status Error { get; set; }

Any error that occurred while processing this document.

Property Value
Type Description
Status

HasContent

public bool HasContent { get; }

Gets whether the "content" field is set.

Property Value
Type Description
bool

HasUri

public bool HasUri { get; }

Gets whether the "uri" field is set.

Property Value
Type Description
bool

MimeType

public string MimeType { get; set; }

An IANA published media type (MIME type).

Property Value
Type Description
string

Pages

public RepeatedField<Document.Types.Page> Pages { get; }

Visual page layout for the Document.

Property Value
Type Description
RepeatedField<Document.Types.Page>

Revisions

public RepeatedField<Document.Types.Revision> Revisions { get; }

Placeholder. Revision history of this document.

Property Value
Type Description
RepeatedField<Document.Types.Revision>

ShardInfo

public Document.Types.ShardInfo ShardInfo { get; set; }

Information about the sharding, if this document is a shard of a larger document. If the document is not sharded, this message is not specified.

Property Value
Type Description
Document.Types.ShardInfo

SourceCase

public Document.SourceOneofCase SourceCase { get; }

Gets which field of the "source" oneof ("uri" or "content") is set.

Property Value
Type Description
Document.SourceOneofCase
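
SourceCase, HasContent, and HasUri together report which source field is populated. The sketch below shows one way they might be used. It assumes the enum members `Uri` and `Content`, which follow the usual protobuf C# naming for oneof cases but are not listed on this page. The `SourceCaseDemo` class and the sample values are illustrative.

```csharp
using System;
using Google.Cloud.DocumentAI.V1Beta3;
using Google.Protobuf;

public static class SourceCaseDemo
{
    // Describes which source field, if any, is populated on the document.
    public static string DescribeSource(Document document)
    {
        switch (document.SourceCase)
        {
            // Assumed enum members, following standard protobuf codegen.
            case Document.SourceOneofCase.Uri:
                return "uri: " + document.Uri;
            case Document.SourceOneofCase.Content:
                return "content: " + document.Content.Length + " bytes";
            default:
                return "no source set";
        }
    }

    public static void Main()
    {
        var document = new Document { MimeType = "application/pdf" };
        Console.WriteLine(DescribeSource(document)); // neither field is set yet

        document.Uri = "gs://bucket_name/object_name";
        Console.WriteLine(DescribeSource(document));

        // Setting the other member of a oneof replaces the previous one.
        document.Content = ByteString.CopyFromUtf8("sample bytes");
        Console.WriteLine(DescribeSource(document));
        Console.WriteLine(document.HasUri);
    }
}
```

Because the two fields share a oneof, checking SourceCase once is usually simpler than testing HasContent and HasUri separately.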

Text

public string Text { get; set; }

Optional. UTF-8 encoded text in reading order from the document.

Property Value
Type Description
string

TextChanges

public RepeatedField<Document.Types.TextChange> TextChanges { get; }

Placeholder. A list of text corrections made to Document.text. This is usually used for annotating corrections to OCR mistakes. Text changes for a given revision must not overlap with each other.

Property Value
Type Description
RepeatedField<Document.Types.TextChange>

TextStyles

[Obsolete]
public RepeatedField<Document.Types.Style> TextStyles { get; }

Styles for the Document.text.

Property Value
Type Description
RepeatedField<Document.Types.Style>

Uri

public string Uri { get; set; }

Optional. Currently supports Google Cloud Storage URI of the form gs://bucket_name/object_name. Object versioning is not supported. For more information, refer to Google Cloud Storage Request URIs.

Property Value
Type Description
string
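
The gs://bucket_name/object_name form can be checked before a value is assigned to Uri. The helper below is a hypothetical sketch using only the .NET base class library. It verifies the overall shape and does not enforce the full Cloud Storage naming rules; `GcsUriDemo` and `TryParse` are illustrative names, not part of the API.

```csharp
using System;

public static class GcsUriDemo
{
    // Splits "gs://bucket_name/object_name" into its bucket and object parts.
    // Returns false if the input does not have that shape.
    public static bool TryParse(string uri, out string bucket, out string objectName)
    {
        bucket = "";
        objectName = "";
        const string scheme = "gs://";
        if (uri == null || !uri.StartsWith(scheme, StringComparison.Ordinal))
        {
            return false;
        }

        string rest = uri.Substring(scheme.Length);
        int slash = rest.IndexOf('/');

        // Require a non-empty bucket and a non-empty object name.
        if (slash <= 0 || slash == rest.Length - 1)
        {
            return false;
        }

        bucket = rest.Substring(0, slash);
        objectName = rest.Substring(slash + 1);
        return true;
    }

    public static void Main()
    {
        if (TryParse("gs://my-bucket/invoices/2024/scan.pdf", out string bucket, out string objectName))
        {
            // Prints: bucket=my-bucket, object=invoices/2024/scan.pdf
            Console.WriteLine($"bucket={bucket}, object={objectName}");
        }
    }
}
```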