Manually sifting through documents like purchase orders (POs) to find specific information is a time-consuming and often frustrating task. What if you could simply ask questions in plain English and get immediate, accurate answers? This is now possible with MySQL HeatWave GenAI.
MySQL HeatWave is a fully managed MySQL database service that combines transactions, analytics, machine learning, and GenAI services, without ETL duplication. Also included is MySQL HeatWave Lakehouse, which allows users to query data in object storage, MySQL databases, or a combination of both. Users can deploy MySQL HeatWave–powered apps on a choice of public clouds: Oracle Cloud Infrastructure (OCI), Amazon Web Services (AWS), and Microsoft Azure.
This post demonstrates how you can use HeatWave GenAI to transform an unstructured PDF of a purchase order into a structured, queryable format directly within your database. We’ll walk through the process of loading a PO from an object store and then asking natural language questions to retrieve key details like order dates, item costs, and supplier information.
The Two-Step Process
This capability rests on two stored procedures in the sys schema:
- VECTOR_STORE_LOAD: This procedure reads a document from an object store, parses it automatically, and loads the result into a HeatWave table.
- ML_RAG: This procedure lets you query the contents of the loaded document using natural language questions.
Step 1: Loading the Purchase Order PDF
Once you are connected to your HeatWave DB system, the first step is to load the sample PO PDF, which is stored in an OCI Object Storage bucket. We call the VECTOR_STORE_LOAD procedure to load the document from its URI into a new table called vector_store_data_1 in the ml_benchmark schema.
CALL sys.VECTOR_STORE_LOAD(
  "oci://bucket-name@tenancy-name/unstructured_data/vision/Sample_P0.pdf",
  JSON_OBJECT(
    "schema_name", "ml_benchmark",
    "table_name", "vector_store_data_1",
    "document_parser_model", "meta.llama-4-scout-17b-16e-instruct"
  )
);
The load runs asynchronously by default. The call returns a query labeled task_status_query, which you can run to monitor the load's progress.
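Once the status query reports that the load has finished, a quick sanity check can confirm that the table exists and contains data. The following is a minimal sketch that uses only standard MySQL statements and the schema and table names passed to VECTOR_STORE_LOAD above. It assumes the load completed without errors, and the column layout that DESCRIBE prints may vary by HeatWave version.

```sql
-- Sketch: run only after the status query reports that the load has completed.
-- ml_benchmark and vector_store_data_1 are the names passed to VECTOR_STORE_LOAD.
SHOW TABLES FROM ml_benchmark LIKE 'vector_store_data_1';

-- Inspect the columns HeatWave created for the parsed document.
DESCRIBE ml_benchmark.vector_store_data_1;

-- Count the stored rows; a non-zero count suggests the PDF was parsed and loaded.
SELECT COUNT(*) AS row_count
FROM ml_benchmark.vector_store_data_1;
```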
Step 2: Asking Questions with ML_RAG
Once the document is loaded and processed, you can start asking questions using the ML_RAG procedure. It’s as simple as passing your question as a string.
Query 1: What is the order date?
CALL sys.ML_RAG("What is the order date?", @output, NULL);
Answer: 'The order date is April 22, 2025.'
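The CALL statement itself does not print the answer. ML_RAG writes its result into the session variable passed as the second argument, so you need a follow-up SELECT to read it. The sketch below shows one way to do that. It assumes the result is a JSON document whose generated answer sits under a top-level "text" key. Check the JSON_PRETTY output on your own system before relying on that path.

```sql
-- ML_RAG stores its result in the session variable passed as the second argument.
CALL sys.ML_RAG("What is the order date?", @output, NULL);

-- Show the full JSON result in a readable layout.
SELECT JSON_PRETTY(@output);

-- Assumption: the generated answer is stored under a top-level "text" key.
SELECT JSON_UNQUOTE(JSON_EXTRACT(@output, '$.text')) AS answer;
```

The same two SELECT statements can be rerun after each of the questions in this post.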
Query 2: What was the printer cost?
CALL sys.ML_RAG("what was the printer cost?", @output, NULL);
Answer: 'The price of the printer is $250.00.'
Query 3: Who was the supplier?
CALL sys.ML_RAG("Who was the supplier?", @output, NULL);
Answer: 'The supplier is CV_SuppA01, which has an address of Gruvfaltsgatan 45, 768 90 Kiruna, SWEDEN.'
As these answers show, HeatWave GenAI interprets each question and retrieves the matching detail from the purchase order: the order date, an item price, and the supplier's name and address.
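The calls above pass NULL as the third argument, which leaves the choice of vector store tables to ML_RAG. If your database holds more than one loaded document, you may want to restrict retrieval to the purchase order table. The following is a sketch, assuming ML_RAG accepts a "vector_store" option that lists the tables to search. The @options variable name is illustrative.

```sql
-- Assumption: the "vector_store" option limits retrieval to the listed tables.
SET @options = JSON_OBJECT(
  "vector_store", JSON_ARRAY("ml_benchmark.vector_store_data_1")
);

-- Ask the same question, this time searching only the purchase order table.
CALL sys.ML_RAG("Who was the supplier?", @output, @options);

-- Read the result from the output variable.
SELECT JSON_PRETTY(@output);
```

Scoping the search this way should keep answers grounded in the intended document when several vector store tables exist side by side.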
Try It Yourself
This demonstration highlights how MySQL HeatWave GenAI can revolutionize how you interact with unstructured data. By integrating document parsing and natural language querying directly into the database, it eliminates complex data extraction pipelines and makes information more accessible than ever.
We invite you to try HeatWave AutoML and GenAI. If you’re new to Oracle Cloud Infrastructure, start with the Oracle Cloud Free Trial, a free 30-day trial with US$300 in credits.

