Page(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A page in a Document.
Attributes

Name | Type | Description
---|---|---
page_number | int | 1-based index for current Page in a parent Document. Useful when a page is taken out of a Document for individual processing.
image | google.cloud.documentai_v1.types.Document.Page.Image | Rendered image for this page. This image is preprocessed to remove any skew, rotation, and distortions such that the annotation bounding boxes can be upright and axis-aligned.
transforms | Sequence[google.cloud.documentai_v1.types.Document.Page.Matrix] | Transformation matrices that were applied to the original document image to produce Page.image.
dimension | google.cloud.documentai_v1.types.Document.Page.Dimension | Physical dimension of the page.
layout | google.cloud.documentai_v1.types.Document.Page.Layout | Layout for the page.
detected_languages | Sequence[google.cloud.documentai_v1.types.Document.Page.DetectedLanguage] | A list of detected languages together with confidence.
blocks | Sequence[google.cloud.documentai_v1.types.Document.Page.Block] | A list of visually detected text blocks on the page. A block has a set of lines (collected into paragraphs) that have a common line-spacing and orientation.
paragraphs | Sequence[google.cloud.documentai_v1.types.Document.Page.Paragraph] | A list of visually detected text paragraphs on the page. A collection of lines that a human would perceive as a paragraph.
lines | Sequence[google.cloud.documentai_v1.types.Document.Page.Line] | A list of visually detected text lines on the page. A collection of tokens that a human would perceive as a line.
tokens | Sequence[google.cloud.documentai_v1.types.Document.Page.Token] | A list of visually detected tokens on the page.
visual_elements | Sequence[google.cloud.documentai_v1.types.Document.Page.VisualElement] | A list of detected non-text visual elements e.g. checkbox, signature etc. on the page.
tables | Sequence[google.cloud.documentai_v1.types.Document.Page.Table] | A list of visually detected tables on the page.
form_fields | Sequence[google.cloud.documentai_v1.types.Document.Page.FormField] | A list of visually detected form fields on the page.
symbols | Sequence[google.cloud.documentai_v1.types.Document.Page.Symbol] | A list of visually detected symbols on the page.
provenance | google.cloud.documentai_v1.types.Document.Provenance | The history of this page.
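
As a rough illustration of how the repeated attributes listed above might be inspected, the sketch below counts the elements in each one. It relies only on the attribute names documented here; `summarize_page` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in for a real Page so the snippet runs without the client library or credentials.

```python
from types import SimpleNamespace

# Repeated (Sequence) attributes of Page, as named in the attribute list.
REPEATED_FIELDS = (
    "transforms", "detected_languages", "blocks", "paragraphs", "lines",
    "tokens", "visual_elements", "tables", "form_fields", "symbols",
)


def summarize_page(page):
    """Return {attribute name: element count} for a Page-like object.

    Hypothetical helper, not part of the library. Missing attributes
    count as empty so partial stand-ins also work.
    """
    summary = {"page_number": page.page_number}
    for name in REPEATED_FIELDS:
        summary[name] = len(getattr(page, name, ()))
    return summary


# Stand-in for a real Page (assumption: a real Page exposes the same names).
page = SimpleNamespace(page_number=1, blocks=[object()], tokens=[object()] * 3)
print(summarize_page(page))
```

With a real Document, the same function could presumably be applied to each page in turn to get a quick overview of what was detected.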
Classes
Block
Block(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A block has a set of lines (collected into paragraphs) that have a common line-spacing and orientation.
DetectedLanguage
DetectedLanguage(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Detected language for a structural component.
Dimension
Dimension(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Dimension for the page.
FormField
FormField(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A form field detected on the page.
Image
Image(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Rendered image contents for this page.
Layout
Layout(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Visual element describing a layout unit on a page.
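
A Layout does not carry its text directly; a common pattern is to resolve it against the text of the parent Document. The sketch below assumes that a Layout exposes `text_anchor.text_segments`, each with `start_index` and `end_index` offsets into that text. Those field names are not listed on this page, so treat them as assumptions to verify against the Layout reference. `layout_text` is a hypothetical helper, and the objects are plain stand-ins.

```python
from types import SimpleNamespace


def layout_text(layout, document_text):
    """Concatenate the slices of document_text that a layout points at.

    Assumes layout.text_anchor.text_segments holds objects with
    start_index and end_index offsets into the parent Document's text.
    """
    segments = layout.text_anchor.text_segments
    return "".join(
        document_text[int(seg.start_index):int(seg.end_index)] for seg in segments
    )


# Stand-in objects shaped like a Layout (illustrative, not real API objects).
text = "Invoice 42\nTotal due"
layout = SimpleNamespace(
    text_anchor=SimpleNamespace(
        text_segments=[SimpleNamespace(start_index=0, end_index=7)]
    )
)
print(layout_text(layout, text))  # Invoice
```

Since blocks, paragraphs, lines, and tokens are all layout units, the same helper would likely apply to each of them through their own layout.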
Line
Line(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A collection of tokens that a human would perceive as a line. Does not cross column boundaries, can be horizontal, vertical, etc.
Matrix
Matrix(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Representation of a transformation matrix, intended to be compatible with, and used with, the OpenCV format for image manipulation.
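
To make the OpenCV compatibility concrete, here is a minimal sketch of decoding such a matrix using only the standard library. It assumes the Matrix exposes `rows`, `cols`, `type_` (an OpenCV type code), and `data` (raw bytes), and that the bytes are little-endian in row-major order. None of these details are stated on this page, so they are assumptions; `decode_matrix` is a hypothetical helper and the input is a stand-in object.

```python
import struct
from types import SimpleNamespace

# OpenCV depth codes (CV_8U .. CV_64F) mapped to struct format characters.
_DEPTH_FORMATS = {0: "B", 1: "b", 2: "H", 3: "h", 4: "i", 5: "f", 6: "d"}


def decode_matrix(matrix):
    """Decode a single-channel Matrix-like object into a list of rows.

    Assumes fields rows, cols, type_ (an OpenCV type code) and data
    (raw little-endian bytes in row-major order).
    """
    # OpenCV packs the type as depth + (channels - 1) * 8.
    depth, channels = matrix.type_ % 8, matrix.type_ // 8 + 1
    if channels != 1:
        raise ValueError("this sketch handles single-channel matrices only")
    count = matrix.rows * matrix.cols
    values = struct.unpack(f"<{count}{_DEPTH_FORMATS[depth]}", matrix.data)
    return [
        list(values[r * matrix.cols:(r + 1) * matrix.cols])
        for r in range(matrix.rows)
    ]


# Stand-in for a 2x3 CV_64F (type code 6) affine transform: the identity.
identity = SimpleNamespace(
    rows=2, cols=3, type_=6,
    data=struct.pack("<6d", 1, 0, 0, 0, 1, 0),
)
print(decode_matrix(identity))  # [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
```

In practice the decoded rows would usually be handed to an array library before any image manipulation; the pure-Python version is only meant to show the layout of the data.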
Paragraph
Paragraph(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A collection of lines that a human would perceive as a paragraph.
Symbol
Symbol(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A detected symbol.
Table
Table(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A table representation similar to HTML table structure.
Token
Token(mapping=None, *, ignore_unknown_fields=False, **kwargs)
A detected token.
VisualElement
VisualElement(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Detected non-text visual elements e.g. checkbox, signature etc. on the page.