Layout, blocks, geometry

Block

A paragraph grouped in reading order, with its bounding box, layout class, and content. Emitted on the response when include_blocks=True.

Block

Bases: _Frozen

ATTRIBUTE DESCRIPTION
id

TYPE: int

layout_id

TYPE: int

class_name

TYPE: Annotated[str, Field(alias='class')]

bounding_box

TYPE: BoundingBox

content

TYPE: str

order_index

TYPE: int

id instance-attribute

id: int

layout_id instance-attribute

layout_id: int

class_name instance-attribute

class_name: Annotated[str, Field(alias='class')]

bounding_box instance-attribute

bounding_box: BoundingBox

content instance-attribute

content: str

order_index instance-attribute

order_index: int
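As a rough illustration of how these fields might be consumed, the sketch below rebuilds page text from a list of blocks. BlockStub is a hypothetical stand-in that carries only the fields used here (it is not the SDK class), and the sketch assumes that ascending order_index corresponds to reading order and that class_name holds plain label strings such as "header" and "footer".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockStub:
    """Hypothetical stand-in with only the Block fields used below."""
    id: int
    class_name: str
    content: str
    order_index: int


def reading_order_text(blocks, skip_classes=("header", "footer")):
    """Join block contents by ascending order_index, skipping some classes."""
    # Assumption: a lower order_index means earlier in reading order.
    kept = [b for b in blocks if b.class_name not in skip_classes]
    kept.sort(key=lambda b: b.order_index)
    return "\n\n".join(b.content for b in kept)


# Made-up sample data, deliberately out of order.
blocks = [
    BlockStub(id=2, class_name="text", content="Body paragraph.", order_index=1),
    BlockStub(id=0, class_name="header", content="Page 3", order_index=2),
    BlockStub(id=1, class_name="doc_title", content="A Title", order_index=0),
]
document_text = reading_order_text(blocks)
```

Sorting a filtered copy leaves the input list untouched, which suits response models that are frozen.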

TextItem

A single recognised word or line, with its confidence and bounding box.

TextItem

Bases: _Frozen

ATTRIBUTE DESCRIPTION
text

TYPE: str

confidence

TYPE: float

bounding_box

TYPE: BoundingBox

id

TYPE: int | None

layout_id

TYPE: int | None

source

TYPE: Literal['ocr', 'pdf', 'geometric', 'auto', 'auto_verified'] | None

text instance-attribute

text: str

confidence instance-attribute

confidence: float

bounding_box instance-attribute

bounding_box: BoundingBox

id class-attribute instance-attribute

id: int | None = None

layout_id class-attribute instance-attribute

layout_id: int | None = None

source class-attribute instance-attribute

source: Literal['ocr', 'pdf', 'geometric', 'auto', 'auto_verified'] | None = None
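A minimal sketch of two common post-processing steps, dropping low-confidence items and tallying them by source. TextItemStub is a hypothetical stand-in, not the SDK class; the 0.8 threshold and the assumption that confidence lies between 0 and 1 are illustrative only. Because source defaults to None, the tally maps a missing source to "unknown".

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextItemStub:
    """Hypothetical stand-in with only the TextItem fields used below."""
    text: str
    confidence: float
    source: Optional[str] = None


def confident_text(items, threshold=0.8):
    """Return the text of items whose confidence meets the threshold."""
    # Assumption: confidence is a score in the range 0..1.
    return [item.text for item in items if item.confidence >= threshold]


def count_by_source(items):
    """Count items per source, treating a missing source as 'unknown'."""
    return Counter(item.source or "unknown" for item in items)


# Made-up sample data.
items = [
    TextItemStub("Invoice", 0.98, "ocr"),
    TextItemStub("N0.", 0.41, "ocr"),
    TextItemStub("2024-01-15", 0.93, "pdf"),
    TextItemStub("Total", 0.88),
]
kept = confident_text(items)
sources = count_by_source(items)
```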

BoundingBox

BoundingBox

Bases: _Frozen

ATTRIBUTE DESCRIPTION
points

TYPE: Annotated[Quad, Field(description='4-corner polygon')]

aabb

TYPE: tuple[int, int, int, int]

center

TYPE: Point

width

TYPE: int

height

TYPE: int

points instance-attribute

points: Annotated[Quad, Field(description='4-corner polygon')]

aabb cached property

aabb: tuple[int, int, int, int]

center cached property

center: Point

width cached property

width: int

height cached property

height: int
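The derived attributes can be pictured with the sketch below, which computes them from the four corner points. This is not the SDK implementation: BoundingBoxStub is a hypothetical stand-in, Point is assumed to be an (x, y) pair of integers, aabb is assumed to be ordered (x_min, y_min, x_max, y_max), and center is assumed to be the integer midpoint of that rectangle.

```python
from dataclasses import dataclass
from functools import cached_property
from typing import Tuple

# Assumed shapes; the SDK's own Point and Quad types may differ.
Point = Tuple[int, int]
Quad = Tuple[Point, Point, Point, Point]


@dataclass(frozen=True)
class BoundingBoxStub:
    """Hypothetical stand-in: a 4-corner polygon with derived geometry."""
    points: Quad

    @cached_property
    def aabb(self) -> Tuple[int, int, int, int]:
        # Axis-aligned box enclosing all four corners.
        xs = [x for x, _ in self.points]
        ys = [y for _, y in self.points]
        return (min(xs), min(ys), max(xs), max(ys))

    @cached_property
    def width(self) -> int:
        x_min, _, x_max, _ = self.aabb
        return x_max - x_min

    @cached_property
    def height(self) -> int:
        _, y_min, _, y_max = self.aabb
        return y_max - y_min

    @cached_property
    def center(self) -> Point:
        x_min, y_min, x_max, y_max = self.aabb
        return ((x_min + x_max) // 2, (y_min + y_max) // 2)


# A slightly rotated quad (made-up coordinates).
box = BoundingBoxStub(points=((10, 20), (110, 24), (108, 64), (8, 60)))
```

functools.cached_property stores its result in the instance __dict__, so it works on a frozen dataclass and each value is computed at most once per instance.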

LayoutBox

LayoutBox

Bases: _Frozen

ATTRIBUTE DESCRIPTION
class_name

TYPE: Annotated[str, Field(alias='class')]

class_id

TYPE: int

confidence

TYPE: float

bounding_box

TYPE: BoundingBox

id

TYPE: int | None

class_name instance-attribute

class_name: Annotated[str, Field(alias='class')]

class_id instance-attribute

class_id: int

confidence instance-attribute

confidence: float

bounding_box instance-attribute

bounding_box: BoundingBox

id class-attribute instance-attribute

id: int | None = None
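Because class is a reserved word in Python, the attribute is named class_name and carries the alias 'class'. The sketch below shows the same mapping by hand for a raw dictionary. LayoutBoxStub, layout_box_from_payload, and the sample payload (including its class_id value) are hypothetical; the sketch assumes the serialized key is "class", as the alias suggests, and omits bounding_box for brevity.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class LayoutBoxStub:
    """Hypothetical stand-in; bounding_box is left out for brevity."""
    class_name: str
    class_id: int
    confidence: float
    id: Optional[int] = None


def layout_box_from_payload(payload: Dict[str, Any]) -> LayoutBoxStub:
    """Map a raw dict keyed by 'class' onto the class_name attribute."""
    return LayoutBoxStub(
        class_name=payload["class"],  # assumed wire key, per the alias
        class_id=payload["class_id"],
        confidence=payload["confidence"],
        id=payload.get("id"),  # optional, defaults to None
    )


# Made-up payload for illustration only.
payload = {"class": "table", "class_id": 21, "confidence": 0.91}
layout_box = layout_box_from_payload(payload)
```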

LayoutLabel

LayoutLabel

Bases: StrEnum

ATTRIBUTE DESCRIPTION
abstract

algorithm

aside_text

chart

content

display_formula

doc_title

figure_title

footer

footer_image

footnote

formula_number

header

header_image

image

inline_formula

number

paragraph_title

reference

reference_content

seal

table

text

vertical_text

vision_footnote

supplementary_region

abstract class-attribute instance-attribute

abstract = 'abstract'

algorithm class-attribute instance-attribute

algorithm = 'algorithm'

aside_text class-attribute instance-attribute

aside_text = 'aside_text'

chart class-attribute instance-attribute

chart = 'chart'

content class-attribute instance-attribute

content = 'content'

display_formula class-attribute instance-attribute

display_formula = 'display_formula'

doc_title class-attribute instance-attribute

doc_title = 'doc_title'

figure_title class-attribute instance-attribute

figure_title = 'figure_title'

footer class-attribute instance-attribute

footer = 'footer'

footer_image class-attribute instance-attribute

footer_image = 'footer_image'

footnote class-attribute instance-attribute

footnote = 'footnote'

formula_number class-attribute instance-attribute

formula_number = 'formula_number'

header class-attribute instance-attribute

header = 'header'

header_image class-attribute instance-attribute

header_image = 'header_image'

image class-attribute instance-attribute

image = 'image'

inline_formula class-attribute instance-attribute

inline_formula = 'inline_formula'

number class-attribute instance-attribute

number = 'number'

paragraph_title class-attribute instance-attribute

paragraph_title = 'paragraph_title'

reference class-attribute instance-attribute

reference = 'reference'

reference_content class-attribute instance-attribute

reference_content = 'reference_content'

seal class-attribute instance-attribute

seal = 'seal'

table class-attribute instance-attribute

table = 'table'

text class-attribute instance-attribute

text = 'text'

vertical_text class-attribute instance-attribute

vertical_text = 'vertical_text'

vision_footnote class-attribute instance-attribute

vision_footnote = 'vision_footnote'

supplementary_region class-attribute instance-attribute

supplementary_region = 'SupplementaryRegion'
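Note that supplementary_region is the one member whose value ('SupplementaryRegion') differs from its name, so lookups by value and by name are not interchangeable. The sketch below illustrates this with a cut-down stand-in. LayoutLabelStub and to_label are hypothetical, and the stand-in mixes in str with Enum so that it also runs on Python versions that predate StrEnum.

```python
from enum import Enum


class LayoutLabelStub(str, Enum):
    """Hypothetical subset of LayoutLabel, for illustration only."""
    table = "table"
    text = "text"
    display_formula = "display_formula"
    supplementary_region = "SupplementaryRegion"


def to_label(class_name):
    """Look a label up by value; return None for unrecognised strings."""
    try:
        return LayoutLabelStub(class_name)
    except ValueError:
        return None


by_value = to_label("SupplementaryRegion")
by_name_string = to_label("supplementary_region")  # not a value, so None
is_table = LayoutLabelStub.table == "table"  # members compare equal to str
```

Returning None for unknown strings is one possible policy; raising may be preferable when an unexpected label should stop processing.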

Table

Table

Bases: _Frozen

ATTRIBUTE DESCRIPTION
id

TYPE: int

bounding_box

TYPE: BoundingBox

text

TYPE: str

html

TYPE: str | None

cells

TYPE: list[list[str]] | None

id instance-attribute

id: int

bounding_box instance-attribute

bounding_box: BoundingBox

text instance-attribute

text: str

html class-attribute instance-attribute

html: str | None = None

cells class-attribute instance-attribute

cells: list[list[str]] | None = None

Formula

Formula

Bases: _Frozen

ATTRIBUTE DESCRIPTION
id

TYPE: int

bounding_box

TYPE: BoundingBox

text

TYPE: str

is_inline

TYPE: bool

latex

TYPE: str | None

id instance-attribute

id: int

bounding_box instance-attribute

bounding_box: BoundingBox

text instance-attribute

text: str

is_inline class-attribute instance-attribute

is_inline: bool = False

latex class-attribute instance-attribute

latex: str | None = None

Tables and formulas: partial support today

As of server v2.2.3, the server detects table and formula regions and returns a bounding_box plus row-major OCR text for each, but it does not emit cell structure or LaTeX source. Table.html, Table.cells, and Formula.latex are therefore always None. The SDK is forward-compatible: once the server ships table structure recognition and LaTeX OCR, those fields will populate without any SDK code changes.
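One way to write calling code that works today and keeps working once the richer fields arrive is to prefer the structured field and fall back to text when it is None. The following is a minimal sketch under that assumption. TableStub, FormulaStub, table_plain_text, and formula_source are hypothetical names, not part of the SDK, and the tab-separated rendering of cells is an arbitrary choice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TableStub:
    """Hypothetical stand-in with only the Table fields used below."""
    text: str
    html: Optional[str] = None
    cells: Optional[List[List[str]]] = None


@dataclass(frozen=True)
class FormulaStub:
    """Hypothetical stand-in with only the Formula fields used below."""
    text: str
    is_inline: bool = False
    latex: Optional[str] = None


def table_plain_text(table):
    """Render cells as tab-separated rows if present, else use raw text."""
    if table.cells is not None:
        return "\n".join("\t".join(row) for row in table.cells)
    return table.text


def formula_source(formula):
    """Prefer LaTeX when present, else fall back to the OCR text."""
    return formula.latex if formula.latex is not None else formula.text


# Today's shape: structured fields are None.
today_table = table_plain_text(TableStub(text="a b\nc d"))
today_formula = formula_source(FormulaStub(text="E = mc2"))

# Possible future shape: structured fields populated (made-up values).
later_table = table_plain_text(
    TableStub(text="a b\nc d", cells=[["a", "b"], ["c", "d"]])
)
later_formula = formula_source(FormulaStub(text="E = mc2", latex="E = mc^2"))
```

Checking against None rather than truthiness keeps an empty cells list or an empty LaTeX string from silently triggering the fallback.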