We are announcing generative extraction, a new feature in OCI Document Understanding that uses the latest generative AI models to simplify large-scale document processing for use cases such as invoice and purchase order processing, resume screening, and fraud detection. Traditional document extraction often relies on manual annotation, rigid templates, and layout-specific rules. These approaches create bottlenecks that slow business processing and delay time to market. With generative extraction, organizations can automate extraction workflows by defining the fields they need in natural language, and the service uses generative AI models to deliver accurate custom document extraction at scale.
Generative extraction uses state-of-the-art multimodal vision models to understand documents and return structured results in JSON. Under the hood, the system applies purpose-built pre-processing and post-processing logic to improve accuracy, stabilize outputs, and reduce hallucinations. You define the fields you want to extract. The system learns what those fields mean and consistently extracts them across invoices, forms, statements, contracts, and other semi-structured or unstructured documents, even when layouts vary.
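To make the idea concrete, here is a minimal sketch of what "define fields in natural language, get JSON back" can look like from the consumer's side. The field names, the `FIELD_DEFINITIONS` structure, the `parse_result` helper, and the sample result are all hypothetical and used only for illustration; they do not represent the actual OCI Document Understanding request or response format.

```python
import json

# Hypothetical field definitions: field name -> natural-language description.
# This structure is an assumption for illustration, not the service's schema.
FIELD_DEFINITIONS = {
    "invoice_number": "The unique identifier printed on the invoice",
    "invoice_date": "The date the invoice was issued",
    "total_amount": "The final amount due, including tax",
}

# Hypothetical JSON result, standing in for what an extraction call might return.
SAMPLE_RESULT = (
    '{"invoice_number": "INV-1001", '
    '"invoice_date": "2024-03-15", '
    '"total_amount": "1250.00"}'
)


def parse_result(raw_json, field_definitions):
    """Return a dict containing exactly the requested fields.

    Fields missing from the result are set to None so that downstream
    code always sees the same set of keys.
    """
    data = json.loads(raw_json)
    return {name: data.get(name) for name in field_definitions}


fields = parse_result(SAMPLE_RESULT, FIELD_DEFINITIONS)
print(fields)
```

Keeping the field list in one place means the same definitions can drive both the extraction request and the downstream check that every expected key is present.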
What generative extraction does
- Understands fields from natural language descriptions
- Learns from a few examples when higher accuracy is needed
- Works on multi-page, mixed-layout, and multilingual documents
- Normalizes values into a consistent schema
- Handles both unstructured and semi-structured content
- Integrates directly into existing Custom KV workflows with no pipeline changes
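As an illustration of what normalizing values into a consistent schema means in practice, the sketch below converts differently formatted dates and amounts into one canonical form. It is not the service's internal logic; the accepted date formats and the helper names `normalize_date` and `normalize_amount` are assumptions chosen for the example.

```python
from datetime import datetime
from decimal import Decimal

# Assumed set of input date formats for this example only.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def normalize_date(text):
    """Return the date as an ISO 8601 string, or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_amount(text):
    """Strip a dollar sign and thousands separators, then parse as Decimal.

    Raises decimal.InvalidOperation if the cleaned text is not a number.
    """
    cleaned = text.replace("$", "").replace(",", "").strip()
    return Decimal(cleaned)


print(normalize_date("15/03/2024"))
print(normalize_amount("$1,250.00"))
```

Using `Decimal` rather than `float` avoids rounding surprises when the normalized amounts feed into financial systems.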
Why we built it
- Purpose-built accuracy for high-variance documents: Foundation generative AI models alone are not sufficient for high-variance, high-accuracy data extraction. Real-world documents vary widely in layout, from forms, receipts, and resumes to tables and long multi-page files, and general-purpose models can misinterpret these differences. Generative extraction is purpose-built for document understanding and reinforced with built-in guardrails that promote consistent outputs, maintain high accuracy across document types, and mitigate hallucinations.
- Faster time to market through simplicity: Developers often spend significant time labeling documents, training custom extraction models, and maintaining regex rules, layout maps, and post-processing logic to handle format variations. Generative extraction removes this overhead by letting customers define extraction fields in natural language, without extensive annotation or retraining. The system adapts to new formats automatically, reduces maintenance, and enables document-heavy use cases to scale faster.
- Designed for production-scale document pipelines: Enterprise workflows require predictable behavior, stable schemas, and seamless integration. Generative extraction delivers normalized JSON outputs, supports large document volumes, and integrates directly into existing Custom KV pipelines without disrupting downstream systems.
Getting started
Generative extraction is available now as part of OCI Document Understanding, streamlining enterprise-scale, document-heavy tasks with high efficiency and accuracy. Customers can shorten their time to market by creating a dedicated endpoint and defining their case-specific extraction fields in natural language.
Ready to get started with AI-powered generative extraction?
For more information, see the following resources:
- Get started with generative extraction in OCI Document Understanding
- General OCI Document Understanding documentation
