Vision + LLM · Any visual document → structured, schema-aware JSON

Read what's in front of you.
Then make sense of it.

TextigoAI pulls structured context out of any visual document — invoices, contracts, IDs, business cards, handwritten notes, whiteboard photos, screenshots, scanned PDFs, even sketched diagrams. Not just text — entities, relationships, intent. Give it your JSON schema, get it back filled in.

Try the playground · See how it works

20+ doc types · Schema-aware · PII-aware by design

extracted

14 fields · 0.42s

Invoice

Acme Supply Co.

412 Mercer St, Brooklyn NY

No.

INV-2026-0481

Bill to

Demo Co. Ltd.

99 Field Rd, Austin TX

Date

2026-06-14

Item                     Qty       Amt
Hex bolts 5/16 box        12    $84.00
Galv anchor plate          4   $112.00
Self-leveling compound     2    $46.00
Delivery & handling        1    $25.00

Subtotal                       $267.00
Tax (8.875%)                    $23.69
Total                          $290.69

output.json

LIVE

{
  "vendor": "Acme Supply Co.",
  "invoice_no": "INV-2026-0481",
  "date": "2026-06-14",
  "bill_to": "Demo Co. Ltd.",
  "line_items": [ 4 ],
  "subtotal": 267.00,
  "total": 290.69
}

schema match conf 0.97

context

entities · rels · intent

20+ doc types

invoices, IDs, forms, whiteboards, charts, handwriting…

schema-aware

bring your JSON shape — get it back filled in

PII-aware

redact, vault & audit sensitive fields by default

ms latency

sub-second for most documents

The OCR problem

Legacy OCR gives you a wall of text.
You wanted a record you can act on.

A character is not a number. A line of text is not a line item. A signature isn't a name, a date, and a binding party. TextigoAI reads the document the way a person does — layout, entities, relationships, intent — and hands you back exactly the shape your system expects.

What it sees

Every document. Any layout.

From a crisp PDF to a phone photo of a napkin sketch — one extraction surface, one schema-aware output.

Invoices & receipts

Vendor, dates, line items, tax, totals, payment terms — itemized and totals-reconciled, ready to post.

Contracts & forms

Clauses, effective & renewal dates, signatories, named entities and the binding obligations between them.

Business cards & IDs

vCard-grade contact extraction, MRZ/PDF417 parsing, role and organization with confidence scoring.

Handwriting & whiteboards

Phone-photo-grade messy in, clean structured out — meeting notes, intake forms, scribbled to-do lists.

Charts & diagrams

Pull the underlying data series from a chart image — bar, line, scatter, pie — with axes and labels.

Screenshots & UI

Scrape what you see on screen, including dark-mode dashboards, tables, sidebars, and modal forms.

The difference

Context, not just text.

Same input. Two very different outputs.

Legacy OCR

Text only

ACME SUPPLY CO 412 MERCER ST BROOKLYN NY INVOICE INV-2026-0481 BILL TO Demo Co. Ltd 99 Field Rd Austin TX DATE 2026-06-14 ITEM QTY AMT Hex bolts 5/16 box 12 84.00 Galv anchor plate 4 112.OO Self-leveling compound 2 46.OO Delivery and handling 1 25.00 Subtotal 267.OO Tax 8.875% 23.69 Total 290.69

Wall of strings — no fields, no types
"112.OO" — character confusion bleeds into your DB
Reading order broken — columns merged across rows

TextigoAI

Structured

{
  "document_type": "invoice",
  "vendor": {
    "name": "Acme Supply Co.",
    "address": "412 Mercer St, Brooklyn NY"
  },
  "invoice_no": "INV-2026-0481",
  "issued_at": "2026-06-14",
  "bill_to": "Demo Co. Ltd.",
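  "line_items": [
    { "desc": "Hex bolts 5/16 box", "qty": 12, "amt": 84.00 },
    { "desc": "Galv anchor plate", "qty": 4,  "amt": 112.00 },
    { "desc": "Self-leveling compound", "qty": 2, "amt": 46.00 },
    { "desc": "Delivery & handling", "qty": 1, "amt": 25.00 }
  ],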
  "subtotal": 267.00,
  "tax": 23.69,
  "total": 290.69,
  "reconciled": true
}

Typed fields, normalized dates, currency-aware numbers
Math reconciled — flagged when totals don't match line items
Layout & entity-aware — "bill to" vs "ship to" never confused
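The reconciliation idea is easy to picture on the client side. Below is a minimal Python sketch, not TextigoAI's actual implementation: the function name `reconcile`, the one-cent tolerance, and the shape of the returned report are assumptions made for illustration. It sums the line items, compares that to the subtotal, then checks subtotal plus tax against the total, using the figures from the sample invoice.

```python
from decimal import Decimal

# Sample invoice figures taken from the example on this page.
invoice = {
    "line_items": [
        {"desc": "Hex bolts 5/16 box", "qty": 12, "amt": 84.00},
        {"desc": "Galv anchor plate", "qty": 4, "amt": 112.00},
        {"desc": "Self-leveling compound", "qty": 2, "amt": 46.00},
        {"desc": "Delivery & handling", "qty": 1, "amt": 25.00},
    ],
    "subtotal": 267.00,
    "tax": 23.69,
    "total": 290.69,
}


def _dec(value) -> Decimal:
    """Convert via str() so float noise does not leak into the comparison."""
    return Decimal(str(value))


def reconcile(doc: dict, tolerance: Decimal = Decimal("0.01")) -> dict:
    """Hypothetical check: do line items, subtotal, tax and total agree?"""
    items_sum = sum((_dec(i["amt"]) for i in doc["line_items"]), Decimal("0"))
    subtotal, tax, total = _dec(doc["subtotal"]), _dec(doc["tax"]), _dec(doc["total"])

    issues = []
    if abs(items_sum - subtotal) > tolerance:
        issues.append(f"line items sum to {items_sum}, subtotal is {subtotal}")
    if abs(subtotal + tax - total) > tolerance:
        issues.append(f"subtotal + tax is {subtotal + tax}, total is {total}")
    return {"reconciled": not issues, "issues": issues}
```

Calling `reconcile(invoice)` on the sample reports no issues; dropping a line item makes the first check fail, which is the kind of mismatch the flag is meant to surface.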

How it works

Three steps. One round trip.

Drag-and-drop in the playground or POST to the API — same engine, same output.

Upload

Drop any image, PDF, or photo into the playground — or POST it to /v1/extract. PNG, JPG, HEIC, TIFF, PDF, multi-page — all native.

Extract

Vision model + layout-aware OCR + entity recognition run in one pass. The model sees the page like a person — columns, headers, signatures, marks, hand-corrections.

Structure

Output to your JSON schema, your CSV columns, or fire it straight at your webhook. Confidence scores and a per-field audit trail come along for free.
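If you take the JSON output and want CSV columns on your side, the mapping is short. The sketch below is an illustration in Python using only the standard library; the function name `line_items_to_csv` and the default column names (`desc`, `qty`, `amt`, borrowed from the sample output on this page) are assumptions, and the service's own CSV export may look different.

```python
import csv
import io


def line_items_to_csv(doc: dict, columns=("desc", "qty", "amt")) -> str:
    """Flatten a document's line_items into CSV text with the given columns."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=list(columns),
        extrasaction="ignore",  # skip keys that are not requested columns
        lineterminator="\n",
    )
    writer.writeheader()
    for item in doc.get("line_items", []):
        writer.writerow(item)  # missing keys are written as empty cells
    return buf.getvalue()


sample = {
    "line_items": [
        {"desc": "Hex bolts 5/16 box", "qty": 12, "amt": 84.00},
        {"desc": "Galv anchor plate", "qty": 4, "amt": 112.00},
    ]
}
csv_text = line_items_to_csv(sample)
```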

Schema-aware output

Bring your own schema.

Tell us the shape you want back. TextigoAI binds the document to your fields — coercing types, normalizing units, picking the right entity for each slot — and gives you back a record that drops cleanly into your database, your CRM, your accounting system, your model input.

Any JSON Schema — we honor types, enums, required, and patterns.
Per-field confidence — gate writes on a threshold, fall back to human-in-the-loop.
Unit & format normalization — "Jun 14, '26" → 2026-06-14.
Multi-page & multi-doc — one schema, applied across batches.
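Gating writes on per-field confidence might look like the following Python sketch. The per-field shape used here (`{"value": ..., "confidence": ...}`), the function name `gate_fields`, the 0.9 threshold, and the confidence numbers are all assumptions for illustration; the page does not show how per-field scores are returned.

```python
def gate_fields(fields: dict, threshold: float = 0.9):
    """Split extracted fields into auto-accepted values and ones for human review.

    `fields` maps a field name to {"value": ..., "confidence": float}.
    This shape is hypothetical, not a documented response format.
    """
    accepted, needs_review = {}, {}
    for name, field in fields.items():
        bucket = accepted if field["confidence"] >= threshold else needs_review
        bucket[name] = field["value"]
    return accepted, needs_review


# Illustrative input; the confidence values are made up for the example.
extracted = {
    "vendor": {"value": "Acme Supply Co.", "confidence": 0.99},
    "total": {"value": 290.69, "confidence": 0.97},
    "po_number": {"value": "PO-44812", "confidence": 0.62},
}
accepted, needs_review = gate_fields(extracted)
```

Here `accepted` holds `vendor` and `total`, ready to write, while `po_number` lands in `needs_review` for a person to confirm.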

Your schema in

schema.json

{
  "type": "object",
  "required": ["vendor", "total", "issued_at"],
  "properties": {
    "vendor":     { "type": "string" },
    "issued_at":  { "type": "string", "format": "date" },
    "total":      { "type": "number", "minimum": 0 },
    "currency":   { "enum": ["USD","EUR","GBP"] },
    "po_number":  { "type": "string", "pattern": "^PO-\\d+$" }
  }
}

Filled out

conf 0.97

{
  "vendor":     "Acme Supply Co.",
  "issued_at":  "2026-06-14",
  "total":      290.69,
  "currency":   "USD",
  "po_number":  "PO-44812"
}
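To make the binding concrete, here is a small Python sketch that checks the filled record against the example schema. It is a hand-rolled check covering only the keywords used above (`required`, `type`, `format: date`, `minimum`, `enum`, `pattern`), not a full JSON Schema validator and not how TextigoAI validates internally; the function name `check_record` is an assumption.

```python
import re
from datetime import date

schema = {
    "type": "object",
    "required": ["vendor", "total", "issued_at"],
    "properties": {
        "vendor": {"type": "string"},
        "issued_at": {"type": "string", "format": "date"},
        "total": {"type": "number", "minimum": 0},
        "currency": {"enum": ["USD", "EUR", "GBP"]},
        "po_number": {"type": "string", "pattern": r"^PO-\d+$"},
    },
}

record = {
    "vendor": "Acme Supply Co.",
    "issued_at": "2026-06-14",
    "total": 290.69,
    "currency": "USD",
    "po_number": "PO-44812",
}

_TYPES = {"string": str, "number": (int, float)}


def check_record(rec: dict, sch: dict) -> list:
    """Return a list of human-readable violations (empty list means it fits)."""
    errors = [f"{n}: required field missing" for n in sch.get("required", []) if n not in rec]
    for name, rules in sch.get("properties", {}).items():
        if name not in rec:
            continue
        value = rec[name]
        expected = _TYPES.get(rules.get("type"))
        # bool is a subclass of int in Python, so reject it explicitly.
        if expected and (isinstance(value, bool) or not isinstance(value, expected)):
            errors.append(f"{name}: expected {rules['type']}")
            continue
        if rules.get("format") == "date":
            try:
                date.fromisoformat(value)
            except ValueError:
                errors.append(f"{name}: not an ISO date")
        if "minimum" in rules and value < rules["minimum"]:
            errors.append(f"{name}: below minimum {rules['minimum']}")
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{name}: not one of {rules['enum']}")
        if "pattern" in rules and not re.search(rules["pattern"], value):
            errors.append(f"{name}: does not match {rules['pattern']}")
    return errors
```

`check_record(record, schema)` returns an empty list for the filled record above; a record with `"po_number": "44812"` or a missing `vendor` comes back with one violation each.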

Redaction in flight

on by default

Subject name vaulted · tok_a3f...

Card number redacted · **** 4221

SSN redacted · XXX-XX-XXXX

Address vaulted · tok_a3f...

Invoice total $290.69 · allowed

Audit log x-trace log_2ff1...9c4 · signed

PII & compliance

Sensitive fields, handled by design.

Documents are full of things you do not want sitting in plaintext in your data lake. TextigoAI flags, redacts, or vaults PII the moment it's extracted — before a single field is written downstream.

Redact PII before storage — configurable per field, per environment.
Vault sensitive values and hand your app a reversible token in their place.
Signed audit logs — tamper-evident per-request trace, ready for SOC 2.
On-prem & VPC deployment available for regulated workloads.
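The difference between redacting and vaulting can be sketched in a few lines of Python. Everything named here is hypothetical: the `Vault` class is an in-memory stand-in, and the policy format, field names, and token format are assumptions chosen to mirror the panel above, not TextigoAI's real configuration. Redaction is one-way (the original value is gone); vaulting swaps the value for a token that can be exchanged back later.

```python
import re
import secrets


class Vault:
    """In-memory stand-in for a token vault (illustration only)."""

    def __init__(self):
        self._store = {}

    def put(self, value: str) -> str:
        token = "tok_" + secrets.token_hex(8)
        self._store[token] = value
        return token

    def reveal(self, token: str) -> str:
        return self._store[token]


def mask_card(number: str) -> str:
    """Keep only the last four digits of a card number."""
    digits = re.sub(r"\D", "", number)
    return "**** " + digits[-4:]


def mask_ssn(_ssn: str) -> str:
    """Replace the whole SSN with a fixed mask."""
    return "XXX-XX-XXXX"


MASKERS = {"card_number": mask_card, "ssn": mask_ssn}


def protect(record: dict, policy: dict, vault: Vault) -> dict:
    """Apply a per-field policy: 'vault', 'redact', or 'allow'.

    Fields without a policy entry pass through here, which is a
    simplification of the default-on behavior described above.
    """
    out = {}
    for name, value in record.items():
        action = policy.get(name, "allow")
        if action == "vault":
            out[name] = vault.put(str(value))
        elif action == "redact":
            masker = MASKERS.get(name, lambda _v: "[REDACTED]")
            out[name] = masker(str(value))
        else:
            out[name] = value
    return out


policy = {"name": "vault", "address": "vault", "card_number": "redact", "ssn": "redact"}
record = {
    "name": "Jane Example",
    "address": "99 Field Rd, Austin TX",
    "card_number": "4111 1111 1111 4221",
    "ssn": "123-45-6789",
    "total": 290.69,
}
vault = Vault()
safe = protect(record, policy, vault)
```

In `safe`, the card number reads `**** 4221`, the SSN reads `XXX-XX-XXXX`, the name and address are `tok_...` tokens that `vault.reveal()` can resolve, and the invoice total passes through unchanged.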

API surface

A small, sharp surface.

REST · JSON in, JSON out · sync or async · idempotent.

Verb   Endpoint                  Purpose
POST   /v1/extract               Single doc: returns structured JSON synchronously.
POST   /v1/extract/batch         Bulk: queue many docs, get a job id back.
POST   /v1/extract/with-schema   Bind to your JSON Schema: typed, validated output.
GET    /v1/jobs/<id>             Async job status · webhook on completion.
POST   /v1/redact                PII redaction only: pass-through, no extraction.

Request

# single doc → structured JSON
curl https://api.textigo.ai/v1/extract \
  -H "Authorization: Bearer $TXG_KEY" \
  -F "file=@invoice.pdf" \
  -F "redact_pii=true"

Response

{
  "id": "doc_2ff1c9...",
  "document_type": "invoice",
  "data": { "vendor": "Acme Supply Co.", ... },
  "confidence": 0.97,
  "latency_ms": 418
}
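The same request can be built from Python with only the standard library. This is a sketch that mirrors the curl call above: the endpoint, the `file` and `redact_pii` form fields, and the `TXG_KEY` variable come from that example, while the function names, the generic `application/octet-stream` part type, and the assumption that the body of a successful response is JSON are illustrative. Error handling is left out.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://api.textigo.ai/v1/extract"  # endpoint from the curl example


def build_extract_request(file_bytes: bytes, filename: str, api_key: str,
                          redact_pii: bool = True) -> urllib.request.Request:
    """Assemble a multipart/form-data POST equivalent to the curl call."""
    boundary = uuid.uuid4().hex
    flag = "true" if redact_pii else "false"
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        file_bytes,
        b"\r\n",
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="redact_pii"\r\n\r\n',
        flag.encode(),
        b"\r\n",
        f"--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def extract(path: str, api_key: str) -> dict:
    """Send one document and return the parsed JSON response."""
    with open(path, "rb") as fh:
        request = build_extract_request(fh.read(), os.path.basename(path), api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Usage would be along the lines of `extract("invoice.pdf", os.environ["TXG_KEY"])`, after which `result["data"]` and `result["confidence"]` can be read as in the response shown above.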

Where it ships

Wired into the rest of the family.

TextigoAI is the extraction layer under several Gridspin verticals — same engine, vertical-specific schemas.

Brokerage

Extract from PDF benefits booklets, carrier proposals, and SBCs — into normalized plan comparisons.

broker.gridspin.xyz

Therapeutics

Pull paper intake forms, PROs, and clinic-facing screeners into structured longitudinal records.

therapeutics.gridspin.xyz

Recruiter

Parse resumes, scanned DISC forms, and reference letters into a uniform candidate object — bias-aware.

recruiter.gridspin.xyz

Why teams switch

Replace your OCR. Keep the rest of your stack.

Ship in a day

A single endpoint, your schema, your webhook. No templates to author, no field-mapping spreadsheets to maintain.

One layer, many docs

Stop maintaining four parsers. Invoices, IDs, contracts, and screenshots all share the same engine.

PII safe by default

Redaction, vaulting, and signed audit logs come standard — pass your next compliance review without rework.

Throw a document at it.

Open the playground, drag in your messiest invoice, contract, or photo. See it come back as the JSON you actually want — in under a second.

Try the playground · Read the API docs
