Get started

Typed Python client for the TurboOCR server. Sync + async, HTTP + gRPC, layout-aware Markdown rendering, searchable-PDF generation.

This page is the entire usage surface in one read. Pick the section that matches what you want to do, copy the code, run it.

Install

pip install turboocr             # HTTP + CLI + searchable-PDF
pip install 'turboocr[grpc]'     # add the gRPC transport
pip install 'turboocr[all]'      # everything optional

Requires Python 3.12+.

The three things you'll do 90% of the time

| Task | One-liner | Full snippet below |
| --- | --- | --- |
| Image → text | client.recognize_image("page.png") | Image OCR |
| PDF → Markdown | render_to_markdown(client.recognize_pdf("doc.pdf")).markdown | PDF to Markdown |
| PDF or image → searchable PDF | client.make_searchable_pdf("scan.jpg") | Searchable PDF |

Start a server

The client talks to a TurboOCR server (the OCR engine itself). Easiest way to get one running:

docker run --gpus all -p 8000:8000 -p 50051:50051 \
  -v trt-cache:/home/ocr/.cache/turbo-ocr \
  -e OCR_LANG=latin \
  ghcr.io/aiptimizer/turboocr:v2.2.3

OCR_LANG=latin covers English, French, German, Spanish, …; swap in chinese, greek, eslav, arabic, korean, or thai instead, all of which are baked into the image. The first start primes the TRT engine cache (~30 s); subsequent starts reuse the cache stored in the trt-cache volume and skip that step.

Image → text & layout

from turboocr import Client

with Client(base_url="http://localhost:8000") as client:
    response = client.recognize_image("page.png")

print(f"{len(response.results)} text items")
for item in response.results[:3]:
    print(f"  {item.text!r} (conf={item.confidence:.2f})")

response.results is a list of TextItems. Each has .text, .confidence, and .bounding_box.
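
A common next step is to drop low-confidence items before using the text. The sketch below uses a stand-in `Item` dataclass (hypothetical; it only mirrors the `.text` and `.confidence` attributes described above) so that it runs without a server. With a real response you would pass `response.results` instead. The 0.80 threshold is an arbitrary illustration, not a recommended value.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Stand-in for a TextItem; only .text and .confidence are modelled."""
    text: str
    confidence: float


def confident_text(items, min_confidence=0.80):
    """Join the text of items whose confidence meets the threshold."""
    return "\n".join(i.text for i in items if i.confidence >= min_confidence)


sample = [Item("Invoice", 0.99), Item("~~~", 0.41), Item("Total: 42.00", 0.93)]
print(confident_text(sample))
```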

For paragraph grouping + layout classes (paragraph_title, table, formula, …) and reading order, pass three more flags:

from turboocr import Client

with Client(base_url="http://localhost:8000") as client:
    response = client.recognize_image(
        "page.png",
        layout=True,
        reading_order=True,
        include_blocks=True,
    )

print(f"{len(response.results)} text items, {len(response.blocks)} blocks")
for block in response.blocks:
    x0, y0, x1, y1 = block.bounding_box.aabb
    print(f"  [{block.class_name}] ({x0},{y0})-({x1},{y1})")
    print(f"      {block.content[:80]!r}")

response.blocks holds the paragraphs, grouped and sorted in reading order; response.layout holds the per-region layout boxes without text grouping.

PDF → Markdown

from turboocr import Client, render_to_markdown

with Client(base_url="http://localhost:8000") as client:
    response = client.recognize_pdf(
        "report.pdf", dpi=150, include_blocks=True
    )

doc = render_to_markdown(response)
print(f"pages={len(response.pages)} chars={len(doc.markdown)}")
print(doc.markdown[:500])

The renderer walks the reading order and maps each layout class to a Markdown construct (doc_title → # H1, display_formula → $$ … $$, table → fenced block, etc.). Customise the mapping with MarkdownStyle; see examples/09_markdown_style.py for a runnable demo.
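
To make the idea of a class-to-Markdown mapping concrete, here is a minimal sketch of the technique. It is not turboocr's renderer and does not use MarkdownStyle: `FORMATTERS` and `render_blocks` are hypothetical names, and blocks are modelled as plain `(class_name, content)` pairs that are assumed to be in reading order already.

```python
FENCE = "`" * 3  # Markdown code fence, built here to keep this example readable

# Hypothetical mapping from layout class to a Markdown formatter.
FORMATTERS = {
    "doc_title": lambda text: f"# {text}",
    "display_formula": lambda text: f"$$ {text} $$",
    "table": lambda text: f"{FENCE}\n{text}\n{FENCE}",
}


def render_blocks(blocks):
    """Render (class_name, content) pairs, already in reading order."""
    parts = []
    for class_name, content in blocks:
        # Classes without a formatter fall back to a plain paragraph.
        fmt = FORMATTERS.get(class_name, lambda text: text)
        parts.append(fmt(content))
    return "\n\n".join(parts)


print(render_blocks([("doc_title", "Report"), ("text", "Hello")]))
```

Keeping the mapping in a dictionary is what makes it customisable: changing how one class renders means replacing one entry.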

Searchable PDF

Generate a PDF with an invisible text overlay aligned to page geometry — selectable, copyable, full-text-searchable in every viewer. Input can be a PDF or a single-page image. Tested against PNG, JPEG, BMP, TIFF, GIF, and WebP; the SDK detects format via magic bytes and wraps images into a one-page PDF automatically:

from pathlib import Path
from turboocr import Client

with Client(base_url="http://localhost:8000") as client:
    overlay = client.make_searchable_pdf("scan.pdf", dpi=200)   # PDF in
    # or:
    overlay = client.make_searchable_pdf("photo.jpg", dpi=200)  # image in

Path("scan.searchable.pdf").write_bytes(overlay)

Non-Latin scripts (CJK, Arabic, Cyrillic, …) work without setup — the bundled glyphless font covers every BMP codepoint. See Non-Latin PDFs only if you need to override the default font.

Async

AsyncClient exposes the same methods as Client; each call is awaited. Pair it with asyncio.gather to fan out requests concurrently:

import asyncio
from turboocr import AsyncClient

IMAGES = ["a.png", "b.png", "c.png"]

async def main() -> None:
    async with AsyncClient(base_url="http://localhost:8000") as client:
        responses = await asyncio.gather(
            *(client.recognize_image(img) for img in IMAGES)
        )
    for img, resp in zip(IMAGES, responses, strict=True):
        print(f"{img}: {len(resp.results)} items")

asyncio.run(main())

For folder-scale workloads, see the folder-pipeline recipe.
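
A plain asyncio.gather starts every request at once, which can overwhelm a server when the input list is large. One common way to cap the number of in-flight calls is an asyncio.Semaphore. The sketch below is not the folder-pipeline recipe itself: `gather_bounded` is a hypothetical helper, and `fake_recognize` stands in for `client.recognize_image` so the example runs without a server.

```python
import asyncio


async def fake_recognize(path: str) -> str:
    """Stand-in for client.recognize_image; sleeps instead of calling a server."""
    await asyncio.sleep(0.01)
    return f"ocr:{path}"


async def gather_bounded(paths, worker, limit=4):
    """Run worker(path) for every path with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(path):
        async with semaphore:
            return await worker(path)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(run_one(p) for p in paths))


paths = [f"page_{n}.png" for n in range(10)]
results = asyncio.run(gather_bounded(paths, fake_recognize, limit=3))
print(results[:2])
```

With a real client, `worker` would be `client.recognize_image` and the call would sit inside the `async with AsyncClient(...)` block.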

Configuration cheat-sheet

from turboocr import Client, RetryPolicy

client = Client(
    base_url="http://localhost:8000",   # or TURBO_OCR_BASE_URL env
    api_key="sk-...",                   # or TURBO_OCR_API_KEY env
    auth_scheme="bearer",               # "bearer" | "x-api-key"
    timeout=30.0,                       # per-request, seconds
    default_headers={"X-Tenant": "acme"},
    retry=RetryPolicy(attempts=5, backoff=0.5),
)

Pass http_client=httpx.Client(...) for custom TLS, mTLS, proxies, or connection limits — see Custom httpx.Client.

Retry defaults: HTTP {429, 502, 503, 504}, gRPC {UNAVAILABLE, DEADLINE_EXCEEDED, RESOURCE_EXHAUSTED}, 3 attempts, exponential backoff + jitter, Retry-After honoured. Tune via RetryPolicy(...) — see Configure retries.
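
For readers unfamiliar with the terms, the sketch below illustrates exponential backoff with jitter and Retry-After handling. It is an assumption-laden illustration, not RetryPolicy's implementation: `retry_delay` and `should_retry` are hypothetical helpers, and the "full jitter" formula is one common variant that the SDK may or may not use.

```python
import random

RETRYABLE_HTTP = {429, 502, 503, 504}  # status codes listed above


def retry_delay(attempt, base=0.5, retry_after=None, rng=random.random):
    """Delay in seconds before the retry that follows failed attempt `attempt` (1-based).

    A server-supplied Retry-After value wins; otherwise the delay is drawn
    uniformly from [0, base * 2**(attempt - 1)] ("full jitter").
    """
    if retry_after is not None:
        return float(retry_after)
    ceiling = base * 2 ** (attempt - 1)
    return rng() * ceiling


def should_retry(status, attempt, max_attempts=3):
    """Retry only retryable statuses while attempts remain."""
    return status in RETRYABLE_HTTP and attempt < max_attempts


print(should_retry(503, attempt=1), should_retry(404, attempt=1))
```

The jitter spreads retries from many clients over time, so they do not all hit a recovering server at the same instant.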

CLI

turbo-ocr ocr page.png --output markdown
turbo-ocr pdf report.pdf --dpi 150 --output json
turbo-ocr searchable-pdf scan.pdf -o out.pdf --font-path /path/to/font.ttf
turbo-ocr health --ready

--output accepts json | blocks | text | markdown. Full surface at CLI reference.

Where to go next

| You want… | Go to |
| --- | --- |
| A recipe for a specific problem | How-to guides |
| A long-form walkthrough | Tutorials |
| Method signatures + types | API reference |
| Conceptual background | Explanation |
| Runnable scripts against bundled fixtures | Examples |

Server compatibility

SERVER_API_VERSION_MIN and SERVER_API_VERSION_MAX_EXCLUSIVE document the supported server range. Response models use extra="allow" so additive server changes (e.g. a new request_id field) are preserved on .model_extra instead of crashing on parse.
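
The `_MAX_EXCLUSIVE` suffix implies a half-open range: the minimum is supported, the maximum is not. The sketch below shows such a check under the assumption that versions are dotted strings such as "2.2.3". The helper names and the bounds are illustrative only; read the real bounds from the SDK constants.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn "2.2.3" or "v2.2.3" into (2, 2, 3) for tuple comparison."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def is_supported(server: str, minimum: str, maximum_exclusive: str) -> bool:
    """True when minimum <= server < maximum_exclusive."""
    low = parse_version(minimum)
    high = parse_version(maximum_exclusive)
    return low <= parse_version(server) < high


# Illustrative bounds, not the SDK's actual values.
print(is_supported("v2.2.3", "2.0.0", "3.0.0"))
```

Comparing integer tuples rather than raw strings avoids the classic pitfall where "2.10.0" sorts before "2.9.0".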

Versioning

Names exported by turboocr.__all__ are the public API. Underscored modules (_core, _http, _grpc) are internal and may change at any time. Pre-1.0, breaking changes are signalled by a minor-version bump; deprecated public APIs emit DeprecationWarning and stay supported for at least one minor version after deprecation.
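
Python hides DeprecationWarning in many contexts by default, so it is worth surfacing it deliberately before upgrading. The sketch below records warnings with the standard `warnings` module; `old_api` is a hypothetical function standing in for a deprecated SDK call.

```python
import warnings


def old_api():
    """Hypothetical deprecated function, standing in for a deprecated SDK call."""
    warnings.warn("old_api is deprecated; use new_api", DeprecationWarning, stacklevel=2)
    return "ok"


# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = old_api()

deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]
print(value, len(deprecations))
```

To fail fast in CI instead, run the test suite with `python -W error::DeprecationWarning`, which turns every such warning into an exception.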