New Features in HeatWave GenAI

We recently announced the general availability of HeatWave GenAI, which offers the industry’s first in-database LLMs, an automated in-database vector store, scale-out vector processing, and the ability to have contextual conversations in natural language. To make it easier for customers to adopt generative AI, we have added new features that focus on the areas listed below:

Enable new applications
Improve ease of application development
Improve latency and throughput
Multi-cloud support

Enable new applications

Multi-lingual Support

When we announced HeatWave GenAI, it supported only English-language documents for vector store creation. For customers who wanted to interact in a language other than English, we provided a translation service within HeatWave GenAI that let them query their English documents in their native language.

HeatWave GenAI now adds multi-lingual support for more than two dozen languages, so all HeatWave GenAI APIs can be used in languages other than English. This includes ingesting documents written in those languages, running similarity search on them, and querying them with prompts in the same language.

Let’s explore the new capabilities with the following example. We will create two vector store tables, one by loading a document in English, and the other by loading a document in German.

call sys.vector_store_load('uri-to-english-doc', '{"language": "en"}');
call sys.vector_store_load('uri-to-german-doc', '{"language": "de"}');

The language parameter is specified using a valid 2-letter ISO 639-1 code. The table below shows a few example codes; for a complete list, see Supported Languages, Embedding Models, and LLMs.

Language	Code
English	en
French	fr
German	de
Spanish	es
Portuguese	pt
Hindi	hi
Chinese	zh

We use the ML_RAG routine to create responses to our questions. The language in which the question is asked should align with the language passed in the “language” parameter.

For example, when asking a question in English, specifying the language parameter as “en”:

call sys.ML_RAG('some question in English', @output, '{"model_options": {"language": "en"}}');
SELECT JSON_PRETTY(@output) \G
*************************** 1. row ***************************
JSON_PRETTY(@output): {
"text": " Response in English…",
"citations": [
{
"segment": "Segment in English…",
"distance": 0.36409956216812134,
"document_name":"https://objectstorage.uk-london-1.oraclecloud.com/pdf_files/doc_en.pdf"},
…
]}

Similarly, when asking a question in German, specifying the language parameter as “de”:

call sys.ML_RAG('eine Frage auf Deutsch', @output, '{"model_options": {"language": "de"}}');
SELECT JSON_PRETTY(@output) \G
*************************** 1. row ***************************
JSON_PRETTY(@output): {
"text": " Antwort auf Deutsch…",
"citations": [
{
"segment": "Segment auf Deutsch…",
"distance": 0.23449526313815189,
"document_name":"https://objectstorage.uk-london-1.oraclecloud.com/pdf_files/doc_de.pdf"},
…
]}

In each case, ML_RAG looks up vector store documents in the specified language. The best-matching segments are then passed to the LLM as context, alongside an instruction to respond in the specified language.
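To make the retrieve-then-prompt flow concrete, here is a toy sketch in Python. It is not how HeatWave implements ML_RAG: the two-dimensional vectors, the use of cosine distance, the in-memory list of segments, and the prompt wording are all assumptions chosen only to illustrate the idea of filtering by language, ranking by distance, and handing the best segments to the LLM.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity (0.0 means the vectors point the same way)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def retrieve(query_vec, segments, language, top_k=2):
    """Return the top_k segments in the given language, nearest first."""
    candidates = [s for s in segments if s["language"] == language]
    ranked = sorted(
        candidates,
        key=lambda s: cosine_distance(query_vec, s["embedding"]),
    )
    return ranked[:top_k]


def build_prompt(question, context_segments, language):
    """Combine the retrieved segments and the question into one LLM prompt."""
    context = "\n".join(s["text"] for s in context_segments)
    return (
        f"Answer in language '{language}' using only this context:\n"
        f"{context}\n\nQuestion: {question}"
    )


# Hypothetical segments with made-up embeddings.
segments = [
    {"language": "en", "text": "English segment A", "embedding": [1.0, 0.0]},
    {"language": "de", "text": "Deutsches Segment A", "embedding": [0.9, 0.1]},
    {"language": "de", "text": "Deutsches Segment B", "embedding": [0.0, 1.0]},
]

best = retrieve([1.0, 0.0], segments, "de", top_k=1)
print(build_prompt("eine Frage auf Deutsch", best, "de"))
```

In the real routine, the embedding of the question and the distance computation happen inside the database; the sketch only shows why a smaller distance in the citations means a closer match.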

Multi-lingual support also applies to other HeatWave GenAI routines. In the calls below, replace language_code with a 2-letter code such as de.

ML_GENERATE:

SELECT sys.ML_GENERATE(@query, '{"language": "language_code"}');

HEATWAVE_CHAT:

SET @chat_options = '{"model_options": {"language": "language_code"}}';
CALL sys.HEATWAVE_CHAT(@query);
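The routines above take the language either as a top-level key or nested under model_options. When these option strings are built by a client application, generating them with a JSON library avoids quoting mistakes. The following is a minimal sketch in Python; the helper names are hypothetical, and the check only verifies the two-letter shape of the code, not that HeatWave GenAI supports the language.

```python
import json
import re

# Two lowercase letters: the shape of an ISO 639-1 code such as "en" or "de".
_ISO_639_1_SHAPE = re.compile(r"^[a-z]{2}$")


def _check(code):
    if not _ISO_639_1_SHAPE.match(code):
        raise ValueError(f"expected a 2-letter ISO 639-1 code, got {code!r}")
    return code


def language_options(code):
    """Options JSON with a top-level "language" key."""
    return json.dumps({"language": _check(code)})


def nested_language_options(code):
    """Options JSON with "language" nested under "model_options"."""
    return json.dumps({"model_options": {"language": _check(code)}})


print(language_options("de"))         # {"language": "de"}
print(nested_language_options("de"))  # {"model_options": {"language": "de"}}
```

The resulting strings can then be passed as bound parameters to the SQL calls rather than pasted into the statement text.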

By default, for languages other than English, HeatWave GenAI uses the in-database LLM “llama3-8b-instruct-v1”. Users can also choose other LLMs for non-English languages, such as the OCI Generative AI model “cohere.command-r-plus”.

Optical Character Recognition (OCR) support

Many organizations receive information in print, such as printed contracts and other documents, and often store these documents by scanning them into digital images. With OCR support, HeatWave GenAI can now read the text in image files, including these scanned documents, and ingest it as vector embeddings, enabling enterprises to query the documents with similarity search and RAG.

Improve ease of application development

Generative AI is a new technology, and developing applications for it can require AI expertise. The following features make it simpler for developers to use HeatWave GenAI:

In-database JavaScript support for GenAI

Earlier this year, we introduced support for JavaScript in HeatWave MySQL stored procedures and functions. Because generative AI and LLMs primarily handle text and JSON data, JavaScript is a natural choice for working with them.

We have added native support for the vector data type in JavaScript and the ability to invoke GenAI capabilities from a JavaScript program. This allows developers to preprocess prompts based on SQL data, invoke LLMs directly within the database, and post-process the responses with ease.

The table below shows the SQL and JavaScript functions for HeatWave GenAI:

Use case	SQL function	JavaScript function
Text generation/summarization	ML_GENERATE(prompt, options)	ml.generate(prompt, options)
RAG	ML_RAG(query, options)	ml.rag(query, options)
Embedding generation	ML_EMBED_ROW(query, options)	ml.embed(query, options)

For details on how to use HeatWave GenAI in JavaScript programs, see the HeatWave GenAI documentation.

Support for OCI Generative AI models

HeatWave GenAI in-database LLMs are quantized models such as Mistral-7B-Instruct and Llama-3-8B-Instruct. They use a smaller set of parameters, which allows HeatWave GenAI to run them on the same CPU compute resources as HeatWave. This enables us to provide generative AI capabilities at a lower cost, offer a more secure environment, and deliver predictable performance across all OCI regions and across other clouds.

In addition to its in-database LLMs, HeatWave GenAI integrates with the OCI Generative AI service, allowing customers to use the larger, more powerful LLMs that the service provides. These models are invoked the same way as HeatWave in-database LLMs, by specifying their model ID. For example, to generate text:

To call HeatWave in-database LLMs:

SELECT sys.ML_GENERATE(@query, '{"model_id": "mistral-7b-instruct-v1"}');

To call OCI Generative AI LLMs within HeatWave:

SELECT sys.ML_GENERATE(@query, '{"model_id": "cohere.command-r-plus"}');

To use the OCI Generative AI service, users need to sign up for it explicitly, and usage of its LLMs is charged separately by the service. The table below shows the LLMs currently supported by the OCI Generative AI service:

LLM model_id

cohere.command-r-plus

cohere.command-r-16k

meta.llama-3-70b-instruct

The OCI Generative AI service also supports embedding models that can be used with HeatWave GenAI. The table below shows the embedding models currently supported by the service. The first two models are for inputs in English, while the last two are for inputs in other languages.

Embedding model_id

cohere.embed-english-light-v3.0

cohere.embed-english-v3.0

cohere.embed-multilingual-light-v3.0

cohere.embed-multilingual-v3.0

Similar to invoking LLMs, users can specify the model_id for the OCI Generative AI embedding model for embedding generation:

SELECT sys.ML_EMBED_ROW(@input, '{"model_id": "cohere.embed-english-light-v3.0"}');
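An application that handles several languages has to pick the matching embedding model before calling the routine. The sketch below shows one way to encode the rule from the table above in Python. The model IDs are copied from that table; the helper function and its light flag are hypothetical and not part of HeatWave GenAI.

```python
# Model IDs copied from the table of OCI Generative AI embedding models.
ENGLISH_MODELS = {
    "light": "cohere.embed-english-light-v3.0",
    "full": "cohere.embed-english-v3.0",
}
MULTILINGUAL_MODELS = {
    "light": "cohere.embed-multilingual-light-v3.0",
    "full": "cohere.embed-multilingual-v3.0",
}


def pick_embedding_model(language, light=False):
    """Return an English model for "en", otherwise a multilingual model."""
    models = ENGLISH_MODELS if language == "en" else MULTILINGUAL_MODELS
    return models["light" if light else "full"]


print(pick_embedding_model("en", light=True))
print(pick_embedding_model("de"))
```

The returned ID is what would be supplied as model_id in the options of the embedding call.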

Latency and throughput improvement

LLM inference batch processing

Improving the performance and throughput of HeatWave GenAI in-database LLMs helps organizations handle more load at lower cost. HeatWave GenAI now supports LLM inference batch processing. This allows customers to run GenAI routines such as generation or RAG on multiple queries in parallel across different nodes in the HeatWave cluster, improving LLM inference performance and throughput while keeping the inference latency of each request unchanged.

Users now have the flexibility to submit a single inference request or multiple inference requests in a single query. The table below lists the HeatWave GenAI routines for each mode:

Use case	Single query	Batch processing
Text generation or summarization	ML_GENERATE	ML_GENERATE_TABLE
RAG	ML_RAG	ML_RAG_TABLE
Embedding generation	ML_EMBED_ROW	ML_EMBED_TABLE

To facilitate batch processing, requests are stored in a table. The table can be a HeatWave Lakehouse table, where the requests are loaded from a file such as a CSV file in object storage, or a MySQL InnoDB table.

Examples of running the batch processing routines:

CALL sys.ML_GENERATE_TABLE('db.input_table.query_column', 'db.output_table.out_column', JSON_OBJECT('context_column', 'context'));

CALL sys.ML_RAG_TABLE('db.input_table.query_column', 'db.input_table.out_column', JSON_OBJECT());
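As noted above, the input table can be loaded from a CSV file. The following sketch in Python shows one way to produce such a file from a list of prompts. The single column name query_column mirrors the example calls; the prompts are made up, and whether a header row is wanted depends on how the table is defined, so treat both as assumptions.

```python
import csv
import io


def write_prompts_csv(prompts, out, column="query_column"):
    """Write one prompt per row under a single header column."""
    writer = csv.writer(out)
    writer.writerow([column])
    for prompt in prompts:
        # csv.writer quotes fields containing commas, quotes, or newlines.
        writer.writerow([prompt])


prompts = [
    "Summarize, briefly, the history of the Olympic Games.",
    "What is retrieval-augmented generation?",
]

buffer = io.StringIO()
write_prompts_csv(prompts, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with open(path, "w", newline="", encoding="utf-8").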

Multicloud Support

HeatWave GenAI Support in HeatWave on AWS

With the support of HeatWave GenAI on AWS, AWS customers can develop secure generative AI applications at a lower cost. They can automate vector store creation and vector embedding generation, use in-database large language models (LLMs) running on CPUs or optionally use Amazon Bedrock foundation models, get the best price-performance in the industry for vector processing, and have natural language conversations with their documents in Amazon S3.

For similarity search, HeatWave GenAI on AWS delivers 39x better price-performance than Snowflake, 96x better than Databricks, and 48x better than Google BigQuery.

Amazon Bedrock

To invoke LLMs, users can choose between HeatWave in-database LLMs and Amazon Bedrock foundation models such as the Cohere and Llama models. AWS charges separately for access to the Amazon Bedrock foundation models.

Below is a list of supported Amazon Bedrock models. Follow these instructions to enable access to them.

Model_id

cohere.command-text-v14

cohere.command-light-text-v14

meta.llama2-70b-chat-v1

meta.llama3-70b-instruct-v1:0

HeatWave

HeatWave GenAI on AWS provides the same functionality as HeatWave GenAI on OCI. It supports use cases such as content generation, summarization, and retrieval-augmented generation (RAG). Let us see how to use HeatWave GenAI to generate content related to the Paris 2024 Olympic Games.

Content Generation

In this section, we use HeatWave GenAI as an assistant to prepare for a presentation about the Olympic games.

Set your query.

MySQL> set @query = "I'm preparing a presentation on the impact of the Olympics on the economy and society. Could you suggest a few key points that I should include?";

Run the query to generate the desired content.

MySQL> SELECT sys.ML_GENERATE(@query, JSON_OBJECT("model_id", "llama3-8b-instruct-v1"));

HeatWave GenAI generates the content for you.

| {"text": "\n\nWhat a great topic! The Olympic Games have a significant impact on both the economy and society, and there are many interesting points to explore. Here are some key points you may want to consider including in your presentation:\n\n**Economic Impact:**\n\n1. **Job creation**: Hosting the Olympics creates thousands of temporary jobs during the construction phase, as well as permanent jobs in industries such as hospitality, tourism, and sports management.\n2. **Infrastructure development**: The Olympics often lead to infrastructure upgrades, such as new transportation systems, stadiums, and public facilities, which can benefit the local community long after the Games are over.\n3. **Tourism boost**: The Olympics attract millions of visitors from around the world, generating significant revenue for local businesses, including hotels, restaurants, and shops.\n4. **Economic legacy**: Many Olympic host cities experience a lasting economic impact, with some studies suggesting that the Games can lead to long-term GDP growth and increased investment in the region.\n\n**Societal Impact:**\n\n1. **Community engagement**: The Olympics often bring people together, fostering community spirit and social cohesion through events like torch relays, cultural festivals, and volunteer opportunities.\n2. **Health and wellness**: The Olympics promote physical activity and healthy lifestyles, inspiring people to adopt"} |

Content summarization

Here, we use HeatWave GenAI to summarize Wikipedia’s introduction about the Olympic games.

Set the text you want to summarize.

MySQL> set @text = "The modern Olympic Games are the world's leading international sporting events. They feature summer and winter sports competitions in which thousands of athletes from around the world participate in a variety of competitions. The Olympic Games are considered the world's foremost sports competition, with more than 200 teams, representing sovereign states and territories, participating. By default, the Games generally substitute for any world championships during the year in which they take place (however, each class usually maintains its own records). The Olympic Games are held every four years. Since 1994, they have alternated between the Summer and Winter Olympics every two years during the four-year Olympiad.";

Run the query to summarize the text.

MySQL> SELECT sys.ML_GENERATE(@text, JSON_OBJECT("task", "summarization", "model_id", "llama3-8b-instruct-v1"));

HeatWave GenAI summarizes the text for you.

| {"text": " The modern Olympic Games are global sporting events with summer and winter sports competitions involving thousands of athletes from around the world. They are considered the world's top sports competition with over 200 participating teams. The Games substitute for any world championships during their year and are held every four years, alternating between Summer and Winter Olympics every two years."} |

Retrieval Augmented Generation

In this section, we load an unstructured document (a PDF file) that contains some statistics about the Paris 2024 Olympic Games from an S3 bucket into HeatWave, and then use HeatWave GenAI to answer some questions about the Games.

Load the files that contain proprietary enterprise data into the HeatWave cluster.

MySQL> SET @dl_tables = '[{"db_name": "vector_store", "tables": [{"table_name": "paris_2024", "dialect": {"format": "pdf"}, "file": [{"prefix": "s3://YOUR_BUCKET_NAME/paris_2024.pdf"}]}]}]';
MySQL> SET @options = JSON_OBJECT('policy', 'disable_unsupported_columns', 'external_tables', CAST(@dl_tables AS JSON));
MySQL> CALL sys.HEATWAVE_LOAD(@dl_tables, @options);

Run the query to generate more accurate and contextually relevant responses with RAG.

MySQL> CALL sys.ML_RAG("Answer the following questions: 1. Which country won the most medals at the Paris Olympics, and what was the total medal count? 2. How many world records were broken at the Paris Olympic games? 3. How many apartments were built for the Paris Olympic games?", @output, NULL);

HeatWave GenAI sifts the documents and provides you with the response with the correct context.

| {
"text": " 1. The United States won the most medals at the Paris Olympics, with a total of 126 medals.\n\n2. There were 17 world records broken at the Paris Olympic Games.\n\n3. An Olympic Village of 2,800 apartments was built for the Paris Olympic Games.", "citations": … } |
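Once the output has been fetched by a client, the answer and its citations can be separated with any JSON library. The sketch below, in Python, assumes the output has the shape shown in the multi-lingual examples earlier in this post (a text field plus a citations array whose entries carry segment, distance, and document_name). The sample values are made up for illustration.

```python
import json


def closest_citations(output_json, limit=3):
    """Return the answer text and the citations with the smallest distance."""
    result = json.loads(output_json)
    citations = sorted(
        result.get("citations", []),
        key=lambda c: c["distance"],
    )
    return result["text"].strip(), citations[:limit]


# Made-up sample in the same shape as the ML_RAG output.
sample = json.dumps({
    "text": " Response in English",
    "citations": [
        {"segment": "Segment B", "distance": 0.36, "document_name": "doc_en.pdf"},
        {"segment": "Segment A", "distance": 0.23, "document_name": "doc_en.pdf"},
    ],
})

answer, cites = closest_citations(sample, limit=1)
print(answer)
for c in cites:
    print(f'{c["distance"]:.2f}  {c["document_name"]}  {c["segment"]}')
```

Sorting by distance puts the segments most similar to the question first, which is useful when showing sources next to an answer.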

Natural Language Interaction

Let’s interact with HeatWave in natural language. For this purpose, we use HeatWave to retrieve information about the country that won the most medals at the Paris 2024 Olympic Games (this information is contained in the PDF file loaded into HeatWave) and then ask a follow-up question about that country.

Use HEATWAVE_CHAT to retrieve the information of interest.

MySQL> SET @chat_options = JSON_OBJECT("model_options", JSON_OBJECT("model_id", "cohere.command-text-v14"));
MySQL> call sys.heatwave_chat("What is the capital of the country that won the most medals at the Paris Olympics?");

HeatWave GenAI sifts the documents and provides you with the response with the correct context.

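+-----------------------------------------------------+
| response                                            |
+-----------------------------------------------------+
| The capital of the United States is Washington D.C. |
+-----------------------------------------------------+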

Ask a follow-up question about the previous answer.

MySQL> call sys.HEATWAVE_CHAT("Name other cities in this country.");
+-----------------------------------------------------+
| response                                            |
+-----------------------------------------------------+
| I can’t make a list of cities in the USA since this would generate a large number of results, but here are some other American cities:
1. New York
2. Los Angeles
3. Chicago
4. Miami
5. Atlanta
6. Boston

The list could continue but these are some of the major cities in the USA. |
+-----------------------------------------------------+

Sanjay Jinturkar

Senior Director, GenAI and Machine Learning

HeatWave Chat: How to Create Your Personal Chatbot

HeatWave GenAI for e-Commerce Applications: Part 2
