Oracle HeatWave GenAI provides integrated and automated generative AI with in-database large language models (LLMs); an automated, in-database vector store; scale-out vector processing; and the ability to have contextual conversations in natural language – letting you take advantage of generative AI without AI expertise, data movement, or additional cost.
HeatWave Vector Store provides simple, fast, scalable, fully integrated, and automated vector store creation. Users only need to execute one simple and familiar SQL command. Under the hood, this involves a multi-step process including reading unstructured data in PDF, HTML, Word, TXT, or PowerPoint formats from object storage, parsing the text out of these documents, partitioning it into smaller semantically sound segments, generating vector embeddings from them, and finally, storing the embeddings in a standard HeatWave Lakehouse table. You may read more about this in the HeatWave GenAI Technical Brief.
You can use GenAI to query your own documents using natural language, and this post will explain how easy it is to get started. You will need a basic knowledge of Oracle Cloud Infrastructure (OCI), but you don't have to be an expert. If you need detailed instructions, please see the HeatWave GenAI Getting Started Guide. To use this guide, you will need to create a compartment, a Virtual Cloud Network (VCN), a compute instance, set some permissions, install MySQL Shell, etc. But all of these steps – except for database creation – may be found in this blog post. (Note: you don't need to install Terraform or the OCI Resource Manager). This post is a little old, so do not create the "MySQL System". Instead, you will want to create a HeatWave instance, and for this, you may use this post. You may already have a compartment, a VCN and a compute instance – and if so, then the rest of the steps will be fairly simple.
When you create a HeatWave instance on OCI, you will want to be sure to enable Heatwave and Lakehouse. You may follow these instructions, and the only exception is that when you get to the part where you select the MySQL version, you will need MySQL version 9.0.1 or greater.
Now that you have created your compute and HeatWave instances, you may create an object storage bucket, and upload your documents. Supported documents include PDF, PPT, TXT, HTML, and DOC formats. If you have a large number of documents to load into object storage, follow the instructions in this post on how to upload documents via a simple BASH script using the Oracle Command Line interface.
Once your documents have been loaded, connect to your compute instance, launch MySQL Shell and connect to your HeatWave instance. You should see something like this:
$ mysqlsh -uadmin -h10.0.1.186 --sqlc MySQL Shell 8.0.35 Copyright (c) 2016, 2023, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type '\help' or '\?' for help; '\quit' to exit. Creating a Classic session to 'admin@10.0.1.186' Fetching global names for auto-completion... Press ^C to stop. Your MySQL connection id is 496 Server version: 9.0.1-cloud MySQL Enterprise - Cloud No default schema selected; type \use <schema> to set one. MySQL 10.0.1.186:3306 ssl SQL >
For this example, I created an object storage bucket named NASA_bucket and uploaded 70 e-book PDF's that I downloaded from NASA. With the PDF's in the bucket, now I just need to load these PDF's into HeatWave. I will create a database named nasa, and then make that my active database:
create database nasa; use nasa;
Since this is the first time I have loaded documents into this instance, I need to call the following method to create a schema used for task management:
select mysql_task_management_ensure_schema();
Now I can load the documents from the bucket using this format:
Example:
call sys.vector_store_load('oci://<BucketName>@<Namespace>/<Path>/<Filename>',
'{"table_name": "<EmbeddingsTableName>"}');
And here is the command I used:
call sys.vector_store_load('oci://NASA_bucket@my_namespace/', '{"table_name": "nasa"}');
You will see a response which will include a query you can use to can see the progress of the data load. It will look like this, but the id value at the end might be different if you have already loaded files earlier:
SELECT id, name, message, progress, status, scheduled_time, estimated_completion_time, estimated_remaining_time, progress_bar FROM mysql_task_management.task_status WHERE id=1\G
*************************** 1. row ***************************
id: 1
name: Vector Store Loader
message: Task starting.
progress: 0
status: RUNNING
scheduled_time: 2024-08-14 16:18:35
estimated_completion_time: NULL
estimated_remaining_time: NULL
progress_bar: __________
1 row in set (0.0022 sec)
After the documents have been loaded, I can see what tables were created. Since I am only loading PDF's, only one table will be created – and it is named nasa_pdf.
show tables; +----------------+ | Tables_in_nasa | +----------------+ | nasa_pdf | +----------------+
If you loaded multiple document formats, you may have multiple tables. I can see how many vectors were stored in this table by doing a simple select count(*) from that table:
select count(*) from nasa_pdf; +----------+ | count(*) | +----------+ | 115158 | +----------+
There are two LLM's available for you to use – mistral-7b-instruct-v1 and llama2-7b-v1. To load an LLM, use one of these statements:
CALL sys.ML_MODEL_LOAD("mistral-7b-instruct-v1", NULL);
CALL sys.ML_MODEL_LOAD("llama2-7b-v1", NULL);
You are now ready to use natural language to search the data. There are three ways to ask questions. (Be sure to first select which database you want to use via use database;)
CALL sys.heatwave_chat("question_in_natural_language");
set @query=“question_in_natural_language”;
select sys.ML_GENERATE(@query, JSON_OBJECT('max_tokens', 2048));
CALL sys.ML_RAG("question_in_natural_language?", @output, NULL);
SELECT JSON_PRETTY(@output) \G;
The heatwave_chat routine automatically calls the ML_RAG routine which loads an LLM and runs a semantic search on the available vector stores by default. If the routine cannot find a vector store, then it calls the ML_GENERATE routine and uses information available in LLM training data, which is primarily information that is available in public data sources, to generate a response for the entered query.
I will ask the same question using each of the three methods, so you can see the different outputs:
call sys.heatwave_chat("How many Space Shuttle missions were flown?")\G
*************************** 1. row ***************************
response: The total number of Space Shuttle missions flown is 135.
set @query="How many Space Shuttle flights have been flown?";
select sys.ML_GENERATE(@query, JSON_OBJECT('max_tokens', 2048))\G
*************************** 1. row ***************************
sys.ML_GENERATE(@query, JSON_OBJECT('max_tokens', 2048)): {"text": "The total number of Space Shuttle flights was 135."}
CALL sys.ML_RAG("How many Space Shuttle flights have been flown?", @output, NULL);
SELECT JSON_PRETTY(@output) \G;
*************************** 1. row ***************************
JSON_PRETTY(@output): {
"text": "There have been 135 Space Shuttle flights.",
"citations": [
{
"segment": "1. These include 2 suborbital X-15 flights, 2 suborbital Mercury flights, 4 orbital Mercury flights, 10 Gemini flights, 12 Apollo flights, 3 Skylab flights, and 135 Space Shuttle flights; 6 Vostok flights, 2 Voskhod flights, 1 suborbital Soyuz flights, 110 orbital Soyuz flights, 3 Shenzhou flights, and 3 suborbital SpaceShipOne flights.",
"distance": 0.2939179539680481,
"document_name": "https://objectstorage.us-ashburn-1.oraclecloud.com/n/my_namespace/b/NASA_bucket/o/not_yet_imagined_tagged.pdf"
},
{
"segment": "Starting with the first Shuttle flight on 12 April 1981, 355 individuals from 16 countries flew 852 times aboard the Shuttle. The five Shuttles traveled more than 542 million miles (872 million kilometers) and hosted more than 2,000 experiments in the fields of Earth, astronomical, biological, and materials sci -ences. Shuttles deployed 180 payloads, including satellites, returned 52 from space and retrieved, repaired and redeployed seven spacecraft. HST was the most famous deployment from",
"distance": 0.3383563160896301,
"document_name": "https://objectstorage.us-ashburn-1.oraclecloud.com/n/my_namespace/b/NASA_bucket/o/695726main_ComingHome-ebook.pdf"
},
{
"segment": "the first Shuttle flight of 1996 was highlighted by the retrieval of a Japanese satellite, the deployment and retrieval of a NaSa science payload, and two spacewalks.",
"distance": 0.341661274433136,
"document_name": "https://objectstorage.us-ashburn-1.oraclecloud.com/n/my_namespace/b/NASA_bucket/o/656773main_ShuttleRetrospectiveBook-ebook.pdf"
}
],
"vector_store": [
"`nasa`.`nasa_pdf`"
]
}
The sys.heatwave_chat option quickly provides you with a response, and there aren't any options which you can use to modify the search criteria.
For the sys.ML_GENERATE method, you have several options to use. As an example, I provided an option for the number of max_tokens generated. (For this question, I didn't really need to use a high value of 2048 – I just wanted to show how to use the options feature). If you add additional options, they must be in JSON format. A full list of the options is available here.
The sys.ML_RAG method provides you with more information than the other two methods. The output generated by RAG is comprised of two parts:
– The text section contains the text-based content generated by the LLM as a response for your query.
– The citations section shows the segments and documents it referred to as context.
If you want to clear the previous chat history and states, use this to reset the session:
SET @chat_options=NULL;
And that's all you need to do in order to play around with HeatWave GenAI. There are obviously a lot more features and functionality available to you, so please check out the HeatWave manual to learn more.
