Get started¶

Typed Python client for the TurboOCR server. Sync + async, HTTP + gRPC, layout-aware Markdown rendering, searchable-PDF generation.

This page is the entire usage surface in one read. Pick the section that matches what you want to do, copy the code, run it.

Install¶

pip install turboocr             # HTTP + CLI + searchable-PDF
pip install 'turboocr[grpc]'     # add the gRPC transport
pip install 'turboocr[all]'      # everything optional

Requires Python 3.12+.

The three things you'll do 90% of the time¶

Task	One-liner	Full snippet below
Image → text	`client.recognize_image("page.png")`	Image OCR
PDF → Markdown	`render_to_markdown(client.recognize_pdf("doc.pdf")).markdown`	PDF to Markdown
PDF or image → searchable PDF	`client.make_searchable_pdf("scan.jpg")`	Searchable PDF

Start a server¶

The client talks to a TurboOCR server (the OCR engine itself). Easiest way to get one running:

docker run --gpus all -p 8000:8000 -p 50051:50051 \
  -v trt-cache:/home/ocr/.cache/turbo-ocr \
  -e OCR_LANG=latin \
  ghcr.io/aiptimizer/turboocr:v2.2.3

OCR_LANG=latin covers English, French, German, Spanish, …. Swap for chinese, greek, eslav, arabic, korean, or thai — all baked in. The first start primes the TRT engine cache (~30 s); subsequent starts are instant.

Image → text & layout¶

from turboocr import Client

with Client(base_url="http://localhost:8000") as client:
    response = client.recognize_image("page.png")

print(f"{len(response.results)} text items")
for item in response.results[:3]:
    print(f"  {item.text!r} (conf={item.confidence:.2f})")

response.results is a list of TextItems. Each has .text, .confidence, and .bounding_box.

For paragraph grouping + layout classes (paragraph_title, table, formula, …) and reading order, pass three more flags:

from turboocr import Client

with Client(base_url="http://localhost:8000") as client:
    response = client.recognize_image(
        "page.png",
        layout=True,
        reading_order=True,
        include_blocks=True,
    )

print(f"{len(response.results)} text items, {len(response.blocks)} blocks")
for block in response.blocks:
    x0, y0, x1, y1 = block.bounding_box.aabb
    print(f"  [{block.class_name}] ({x0},{y0})-({x1},{y1})")
    print(f"      {block.content[:80]!r}")

response.blocks is the reading-order-grouped paragraphs; response.layout is the per-region layout boxes without text grouping.

PDF → Markdown¶

from turboocr import Client, render_to_markdown

with Client(base_url="http://localhost:8000") as client:
    response = client.recognize_pdf(
        "report.pdf", dpi=150, include_blocks=True
    )

doc = render_to_markdown(response)
print(f"pages={len(response.pages)} chars={len(doc.markdown)}")
print(doc.markdown[:500])

The renderer walks the reading order and maps each layout class to a Markdown construct (doc_title → # H1, display_formula → $$ … $$, table → fenced block, etc.). Customise the mapping with MarkdownStyle — see examples/09_markdown_style.py for a runnable demo.

Searchable PDF¶

Generate a PDF with an invisible text overlay aligned to page geometry — selectable, copyable, full-text-searchable in every viewer. Input can be a PDF or a single-page image. Tested against PNG, JPEG, BMP, TIFF, GIF, and WebP; the SDK detects format via magic bytes and wraps images into a one-page PDF automatically:

from pathlib import Path
from turboocr import Client

with Client(base_url="http://localhost:8000") as client:
    overlay = client.make_searchable_pdf("scan.pdf", dpi=200)   # PDF in
    # or:
    overlay = client.make_searchable_pdf("photo.jpg", dpi=200)  # image in

Path("scan.searchable.pdf").write_bytes(overlay)

Non-Latin scripts (CJK, Arabic, Cyrillic, …) work without setup — the bundled glyphless font covers every BMP codepoint. See Non-Latin PDFs only if you need to override the default font.

Async¶

Same surface, await-prefixed. Pair with asyncio.gather to fan out:

import asyncio
from turboocr import AsyncClient

IMAGES = ["a.png", "b.png", "c.png"]

async def main() -> None:
    async with AsyncClient(base_url="http://localhost:8000") as client:
        responses = await asyncio.gather(
            *(client.recognize_image(img) for img in IMAGES)
        )
    for img, resp in zip(IMAGES, responses, strict=True):
        print(f"{img}: {len(resp.results)} items")

asyncio.run(main())

For folder-scale workloads, see the folder-pipeline recipe.

Configuration cheat-sheet¶

from turboocr import Client, RetryPolicy

client = Client(
    base_url="http://localhost:8000",   # or TURBO_OCR_BASE_URL env
    api_key="sk-...",                   # or TURBO_OCR_API_KEY env
    auth_scheme="bearer",               # "bearer" | "x-api-key"
    timeout=30.0,                       # per-request, seconds
    default_headers={"X-Tenant": "acme"},
    retry=RetryPolicy(attempts=5, backoff=0.5),
)

Pass http_client=httpx.Client(...) for custom TLS, mTLS, proxies, or connection limits — see Custom httpx.Client.

Retry defaults: HTTP {429, 502, 503, 504}, gRPC {UNAVAILABLE, DEADLINE_EXCEEDED, RESOURCE_EXHAUSTED}, 3 attempts, exponential backoff + jitter, Retry-After honoured. Tune via RetryPolicy(...) — see Configure retries.

CLI¶

turbo-ocr ocr page.png --output markdown
turbo-ocr pdf report.pdf --dpi 150 --output json
turbo-ocr searchable-pdf scan.pdf -o out.pdf --font-path /path/to/font.ttf
turbo-ocr health --ready

--output accepts json | blocks | text | markdown. Full surface at CLI reference.

Where to go next¶

You want…	Go to
A recipe for a specific problem	How-to guides
A long-form walkthrough	Tutorials
Method signatures + types	API reference
Conceptual background	Explanation
Runnable scripts against bundled fixtures	Examples

Server compatibility¶

SERVER_API_VERSION_MIN and SERVER_API_VERSION_MAX_EXCLUSIVE document the supported server range. Response models use extra="allow" so additive server changes (e.g. a new request_id field) are preserved on .model_extra instead of crashing on parse.

Versioning¶

Names exported by turboocr.__all__ are the public API. Underscored modules (_core, _http, _grpc) are internal and may change at any time. Pre-1.0, breaking changes are signalled by a minor-version bump; deprecated public APIs emit DeprecationWarning and stay supported for at least one minor version after deprecation.