_BasePageElement(
documentai_object: typing.Union[
google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
google.cloud.documentai_v1.types.document.Document.Page,
google.cloud.documentai_v1.types.document.Document.Page.Token,
google.cloud.documentai_v1.types.document.Document.Page.Block,
google.cloud.documentai_v1.types.document.Document.Page.Symbol,
],
_page: google.cloud.documentai_toolbox.wrappers.page.Page,
)
Base class for representing a wrapped Document AI Page element (Symbol, Token, Line, Paragraph, Block).
Properties
_text_segment
Page element text segment.
hocr_bounding_box
hOCR bounding box of the page element.
text
Text of the page element.
Methods
_get_children_of_element
_get_children_of_element(
potential_children: typing.List[
google.cloud.documentai_toolbox.wrappers.page._BasePageElement
],
) -> typing.List[google.cloud.documentai_toolbox.wrappers.page._BasePageElement]
Filters potential child elements to identify only those fully contained within this element.
This method iterates through the list of potential child elements and checks whether each one's start and end indices fall completely within this element's start and end indices. Elements that are only partially contained, or that lie entirely outside this element's range, are excluded.
Parameter | |
---|---|
Name | Description |
potential_children | List[_BasePageElement]. Required. A list of wrapped page elements (e.g., words, lines, paragraphs) that could potentially be children of this element. |
Returns | |
---|---|
Type | Description |
List[_BasePageElement] | A new list containing only the wrapped page elements that are fully contained within this element, maintaining their original order. |
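The containment check described for _get_children_of_element can be illustrated with a minimal, standalone sketch. It does not use the toolbox classes: Span and children_of are hypothetical stand-ins introduced only for illustration, and treating the boundaries as inclusive (a child may share the parent's start or end index) is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    """Hypothetical stand-in for a wrapped page element's index range."""

    name: str
    start_index: int
    end_index: int


def children_of(parent: Span, potential_children: List[Span]) -> List[Span]:
    """Keep only the spans whose range lies fully inside the parent's range.

    Assumption: boundaries are inclusive, so a child may start or end
    exactly where the parent does. Input order is preserved.
    """
    return [
        child
        for child in potential_children
        if child.start_index >= parent.start_index
        and child.end_index <= parent.end_index
    ]


paragraph = Span("paragraph", 10, 30)
candidates = [
    Span("before", 0, 9),            # entirely outside
    Span("overlaps_start", 5, 15),   # only partially contained
    Span("first_inside", 10, 18),    # fully contained
    Span("second_inside", 19, 30),   # fully contained
    Span("overlaps_end", 25, 35),    # only partially contained
]

contained = children_of(paragraph, candidates)
print([c.name for c in contained])  # ['first_inside', 'second_inside']
```

Because the filter is a single pass over the input list, the result keeps the original order of the candidates, which matches the "maintaining their original order" note in the Returns table.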