TextigoAI pulls structured context out of any visual document — invoices, contracts, IDs, business cards, handwritten notes, whiteboard photos, screenshots, scanned PDFs, even sketched diagrams. Not just text — entities, relationships, intent. Give it your JSON schema, get it back filled in.
{ "vendor": "Acme Supply Co.", "invoice_no": "INV-2026-0481", "date": "2026-06-14", "bill_to": "Demo Co. Ltd.", "line_items": [ 4 ], "subtotal": 267.00, "total": 290.69 }
The OCR problem
A character is not a number. A line of text is not a line item. A signature isn't a name, a date, and a binding party. TextigoAI reads the document the way a person does — layout, entities, relationships, intent — and hands you back exactly the shape your system expects.
What it sees
From a crisp PDF to a phone photo of a napkin sketch — one extraction surface, one schema-aware output.
Vendor, dates, line items, tax, totals, payment terms — itemized and totals-reconciled, ready to post.
Clauses, effective & renewal dates, signatories, named entities and the binding obligations between them.
vCard-grade contact extraction, MRZ/PDF417 parsing, and role and organization fields with confidence scoring.
Messy phone photos in, clean structured records out: meeting notes, intake forms, scribbled to-do lists.
Pull the underlying data series from a chart image — bar, line, scatter, pie — with axes and labels.
Scrape what you see on screen, including dark-mode dashboards, tables, sidebars, and modal forms.
The difference
Same input. Two very different outputs.
{ "document_type": "invoice", "vendor": { "name": "Acme Supply Co.", "address": "412 Mercer St, Brooklyn NY" }, "invoice_no": "INV-2026-0481", "issued_at": "2026-06-14", "bill_to": "Demo Co. Ltd.", "line_items": [ { "desc": "Hex bolts 5/16 box", "qty": 12, "amt": 84.00 }, { "desc": "Galv anchor plate", "qty": 4, "amt": 112.00 } ], "subtotal": 267.00, "tax": 23.69, "total": 290.69, "reconciled": true }
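A "reconciled" flag like the one above can be cross-checked on the receiving side. The snippet below is a minimal sketch, assuming that reconciliation means subtotal plus tax equals total to the cent; the page does not define the flag's exact semantics, and the helper names `to_money` and `totals_reconcile` are hypothetical.

```python
from decimal import Decimal


def to_money(value) -> Decimal:
    """Convert a JSON number to a two-decimal Decimal to avoid float drift."""
    return Decimal(str(value)).quantize(Decimal("0.01"))


def totals_reconcile(record: dict) -> bool:
    """Return True when subtotal + tax equals total to the cent.

    Assumption: this is what "reconciled" means; the real rule may differ.
    """
    subtotal = to_money(record["subtotal"])
    tax = to_money(record.get("tax", 0))
    total = to_money(record["total"])
    return subtotal + tax == total


# Totals copied from the sample invoice output.
record = {"subtotal": 267.00, "tax": 23.69, "total": 290.69, "reconciled": True}
print(totals_reconcile(record))
```

Using `Decimal` rather than floats keeps the comparison exact for currency amounts.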
How it works
Drag-and-drop in the playground or POST to the API — same engine, same output.
Drop any image, PDF, or photo into the playground — or POST it to /v1/extract. PNG, JPG, HEIC, TIFF, PDF, multi-page — all native.
Vision model + layout-aware OCR + entity recognition run in one pass. The model sees the page like a person — columns, headers, signatures, marks, hand-corrections.
Output to your JSON schema, your CSV columns, or fire it straight at your webhook. Confidence scores and a per-field audit trail come along for free.
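For teams that take the JSON output and still need CSV columns downstream, the flattening step might look like the sketch below. It is illustrative only: the column set and the function name `line_items_to_csv` are assumptions, and the field names (`invoice_no`, `desc`, `qty`, `amt`) are taken from the sample invoice on this page.

```python
import csv
import io


def line_items_to_csv(record: dict) -> str:
    """Flatten an extracted invoice into one CSV row per line item."""
    columns = ["invoice_no", "desc", "qty", "amt"]  # hypothetical column choice
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for item in record.get("line_items", []):
        writer.writerow({
            "invoice_no": record["invoice_no"],
            "desc": item["desc"],
            "qty": item["qty"],
            "amt": f'{item["amt"]:.2f}',  # fixed two decimals for amounts
        })
    return buffer.getvalue()


invoice = {
    "invoice_no": "INV-2026-0481",
    "line_items": [
        {"desc": "Hex bolts 5/16 box", "qty": 12, "amt": 84.00},
        {"desc": "Galv anchor plate", "qty": 4, "amt": 112.00},
    ],
}
print(line_items_to_csv(invoice))
```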
Schema-aware output
Tell us the shape you want back. TextigoAI binds the document to your fields — coercing types, normalizing units, picking the right entity for each slot — and gives you back a record that drops cleanly into your database, your CRM, your accounting system, your model input.
{ "type": "object", "required": ["vendor", "total", "issued_at"], "properties": { "vendor": { "type": "string" }, "issued_at": { "type": "string", "format": "date" }, "total": { "type": "number", "minimum": 0 }, "currency": { "enum": ["USD","EUR","GBP"] }, "po_number": { "type": "string", "pattern": "^PO-\\d+$" } } }
{ "vendor": "Acme Supply Co.", "issued_at": "2026-06-14", "total": 290.69, "currency": "USD", "po_number": "PO-44812" }
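Before a record like this is written to a database, it can be checked against the same schema on the client side. The sketch below is a hand-rolled validator covering only the keywords used in the schema above (`required`, `type`, `format: date`, `minimum`, `enum`, `pattern`); it is not a full JSON Schema implementation, and the function name `validate` is hypothetical.

```python
import re
from datetime import date

SCHEMA = {
    "type": "object",
    "required": ["vendor", "total", "issued_at"],
    "properties": {
        "vendor": {"type": "string"},
        "issued_at": {"type": "string", "format": "date"},
        "total": {"type": "number", "minimum": 0},
        "currency": {"enum": ["USD", "EUR", "GBP"]},
        "po_number": {"type": "string", "pattern": r"^PO-\d+$"},
    },
}


def validate(record: dict, schema: dict) -> list[str]:
    """Return a list of human-readable violations (empty means valid)."""
    errors = []
    for name in schema.get("required", []):
        if name not in record:
            errors.append(f"missing required field: {name}")
    for name, rules in schema.get("properties", {}).items():
        if name not in record:
            continue
        value = record[name]
        kind = rules.get("type")
        if kind == "string" and not isinstance(value, str):
            errors.append(f"{name}: expected string")
            continue
        # bool is a subclass of int in Python, so exclude it explicitly.
        if kind == "number" and (isinstance(value, bool) or not isinstance(value, (int, float))):
            errors.append(f"{name}: expected number")
            continue
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{name}: {value!r} not in {rules['enum']}")
        if "minimum" in rules and value < rules["minimum"]:
            errors.append(f"{name}: below minimum {rules['minimum']}")
        if "pattern" in rules and not re.search(rules["pattern"], value):
            errors.append(f"{name}: does not match {rules['pattern']}")
        if rules.get("format") == "date":
            try:
                date.fromisoformat(value)
            except ValueError:
                errors.append(f"{name}: not an ISO date")
    return errors


RECORD = {"vendor": "Acme Supply Co.", "issued_at": "2026-06-14",
          "total": 290.69, "currency": "USD", "po_number": "PO-44812"}
print(validate(RECORD, SCHEMA))
```

In production, a complete JSON Schema validator library would be the more robust choice; the point here is only to show what each constraint in the schema enforces.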
PII & compliance
Documents are full of things you do not want sitting in plaintext in your data lake. TextigoAI flags, redacts, or vaults PII the moment it's extracted — before a single field is written downstream.
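The three modes named above (flag, redact, vault) can be pictured with a small sketch. This is purely illustrative and not TextigoAI's implementation: which fields count as PII, the `[REDACTED]` marker, the `vault_` token format, and the in-memory dict standing in for a vault are all assumptions.

```python
import secrets

REDACTED = "[REDACTED]"  # hypothetical placeholder text


def apply_pii_policy(record: dict, pii_fields: list, policy: str, vault: dict = None) -> dict:
    """Return a copy of record with the chosen policy applied to PII fields.

    flag   -> values untouched, field names listed under "_pii_flags"
    redact -> values replaced with a fixed marker
    vault  -> values moved into `vault` and replaced by a random token
    """
    if policy not in {"flag", "redact", "vault"}:
        raise ValueError(f"unknown policy: {policy}")
    if policy == "vault" and vault is None:
        raise ValueError("vault policy needs a vault mapping")
    cleaned = dict(record)  # shallow copy; the input record is not modified
    flags = []
    for field in pii_fields:
        if field not in cleaned:
            continue
        flags.append(field)
        if policy == "redact":
            cleaned[field] = REDACTED
        elif policy == "vault":
            token = "vault_" + secrets.token_hex(8)  # random, not derived from the value
            vault[token] = cleaned[field]
            cleaned[field] = token
    cleaned["_pii_flags"] = flags
    return cleaned


store = {}
rec = {"vendor": "Acme Supply Co.", "bill_to": "Demo Co. Ltd.", "total": 290.69}
print(apply_pii_policy(rec, ["bill_to"], "redact"))
```

The key property in all three modes is that the plaintext value never reaches the downstream record unless the policy is explicitly "flag".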
API surface
REST · JSON in, JSON out · sync or async · idempotent.
# single doc → structured JSON
curl https://api.textigo.ai/v1/extract \
  -H "Authorization: Bearer $TXG_KEY" \
  -F "file=@invoice.pdf" \
  -F "redact_pii=true"
{ "id": "doc_2ff1c9...", "document_type": "invoice", "data": { "vendor": "Acme Supply Co.", ... }, "confidence": 0.97, "latency_ms": 418 }
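The same call can be made from Python with only the standard library. The sketch below mirrors the curl example: the endpoint, the `Authorization` header, and the `file` and `redact_pii` form fields come from this page, while the function names, the `application/octet-stream` part type, and the 0.9 confidence threshold are assumptions for illustration.

```python
import json
import urllib.request
import uuid

API_URL = "https://api.textigo.ai/v1/extract"  # endpoint from the curl example


def build_extract_request(file_name: str, file_bytes: bytes, api_key: str,
                          redact_pii: bool = True) -> urllib.request.Request:
    """Build a multipart/form-data POST equivalent to the curl call."""
    boundary = uuid.uuid4().hex
    flag = "true" if redact_pii else "false"
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="redact_pii"\r\n\r\n'
        f"{flag}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=head + file_bytes + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def extract(file_name: str, file_bytes: bytes, api_key: str) -> dict:
    """Send the request and decode the JSON response (performs network I/O)."""
    request = build_extract_request(file_name, file_bytes, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def accept_or_review(result: dict, min_confidence: float = 0.9) -> str:
    """Route a response by its top-level confidence (threshold is an assumption)."""
    return "accept" if result.get("confidence", 0.0) >= min_confidence else "review"
```

A caller would read the file bytes, pass them to `extract` along with the key held in `TXG_KEY`, and then use something like `accept_or_review` to decide whether the record goes straight to posting or to a human queue.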
Where it ships
TextigoAI is the extraction layer under several Gridspin verticals — same engine, vertical-specific schemas.
Extract from PDF benefits booklets, carrier proposals, and SBCs — into normalized plan comparisons.
broker.gridspin.xyz
Pull paper intake forms, PROs, and clinic-facing screeners into structured longitudinal records.
therapeutics.gridspin.xyz
Parse resumes, scanned DISC forms, and reference letters into a uniform candidate object, bias-aware.
recruiter.gridspin.xyz
Why teams switch
A single endpoint, your schema, your webhook. No templates to author, no field-mapping spreadsheets to maintain.
Stop maintaining four parsers. Invoices, IDs, contracts, and screenshots all share the same engine.
Redaction, vaulting, and signed audit logs come standard — pass your next compliance review without rework.
Open the playground, drag in your messiest invoice, contract, or photo. See it come back as the JSON you actually want — in under a second.