Last year, we announced Select AI Retrieval Augmented Generation (RAG) for Oracle Autonomous Database, which uses Oracle Database 23ai AI Vector Search and transformers hosted by your choice of supported AI providers. Cloud-based transformers (also known as embedding models) require the data to make a round trip to and from the transformer when creating the vector index, as well as for each prompt. We're now announcing general availability of in-database transformers with Select AI RAG, which streamlines this process: the round trip is eliminated by running transformers with the in-database ONNX Runtime. Further, imported transformers become first-class database model objects, taking advantage of database backup, recovery, and security.
Select AI with RAG simplifies and automates key RAG steps, which helps bridge the knowledge gap between what your large language model (LLM) knows and the knowledge in your specific content. With “traditional” RAG implementations, you need to manually code the RAG workflow to orchestrate the various components.
How to specify an in-database transformer with Select AI RAG
With Select AI RAG, you specify a transformer using the “embedding_model” attribute in your AI profile and set the “provider” attribute to the AI provider hosting your model.
In-database embedding models are specified in an AI profile using the same “embedding_model” attribute, in one of two ways:
- If your AI profile is for creating the vector index only, specify the “provider” as “database” and give the name of the in-database transformer for the “embedding_model.”
- If your AI profile will also specify an LLM, specify the “provider” of the LLM, e.g., “oci”, and set the “embedding_model” to the name of the in-database transformer prefixed with “database:”
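For the first case, a profile used only for vector index creation might look like the following sketch (the profile name “EMBEDDING_ONLY” and the schema-qualified model name are illustrative):

BEGIN
  DBMS_CLOUD_AI.CREATE_PROFILE(
    profile_name => 'EMBEDDING_ONLY',   -- illustrative profile name
    attributes   => '{"provider": "database",
                      "embedding_model": "USER1.ALL_MINILM_L12_V2"}'
  );
END;
/
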
Here’s an example of an AI profile specifying an in-database transformer.
BEGIN
  DBMS_CLOUD_AI.CREATE_PROFILE(
    profile_name => 'OCI_GENAI',
    attributes   => '{"provider": "oci",
                      "model": "meta.llama-3.1-70b-instruct",
                      "vector_index_name": "MY_INDEX",
                      "embedding_model": "database: USER1.ALL_MINILM_L12_V2"}'
  );
END;
/
The name of our profile is “OCI_GENAI.” We’re using the OCI (Oracle Cloud Infrastructure) Generative AI Service, so we specify the provider as “oci”. While we could omit the LLM “model” specification and use the default, here we specify it explicitly as “meta.llama-3.1-70b-instruct.” To enable RAG, we specify the “vector_index_name”, which is the name of the index we might have created using the DBMS_CLOUD_AI.CREATE_VECTOR_INDEX procedure. Finally, we specify our embedding model “ALL_MINILM_L12_V2” that we imported into our database, with the “database:” prefix. The Select AI with RAG announcement blog provides a complete example.
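Once the profile is in place, you can ask questions against the indexed content with the DBMS_CLOUD_AI.GENERATE function and the “narrate” action, which retrieves matching chunks from the vector index and sends the augmented prompt to the LLM named in the profile. A minimal sketch (the question itself is illustrative):

-- "narrate" performs the RAG flow: similarity search over MY_INDEX,
-- prompt augmentation, then generation with the profile's LLM
SELECT DBMS_CLOUD_AI.GENERATE(
         prompt       => 'What does our travel policy say about rental cars?',  -- illustrative prompt
         profile_name => 'OCI_GENAI',
         action       => 'narrate') AS response
  FROM dual;
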
Loading your transformer into the database
To load this transformer into the database, create a directory object or use an existing one. Note that Autonomous Database includes a predefined DATA_PUMP_DIR directory object where you can place files. To create a new directory, you must have the CREATE ANY DIRECTORY system privilege; using DATA_PUMP_DIR eliminates the need to get this grant and create a new directory.
Then, download the ONNX-format transformer model file (e.g., from object storage in the example below) and import it to the database using DBMS_VECTOR.LOAD_ONNX_MODEL:
CREATE OR REPLACE DIRECTORY ONNX_DIR AS 'onnx_model';

BEGIN
  DBMS_CLOUD.GET_OBJECT(
    credential_name => NULL,
    directory_name  => 'ONNX_DIR',
    object_uri      => 'https://adwc4pm.objectstorage.us-ashburn-1.oci.customer-oci.com/p/eLddQappgBJ7jNi6Guz9m9LOtYe2u8LWY19GfgU8flFK4N9YgP4kTlrE9Px3pE12/n/adwc4pm/b/OML-Resources/o/all_MiniLM_L12_v2.onnx');
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    directory  => 'ONNX_DIR',
    file_name  => 'all_MiniLM_L12_v2.onnx',
    model_name => 'ALL_MINILM_L12_V2');
END;
/
You can verify that the model was imported successfully using the following query:
SELECT model_name, algorithm, mining_function FROM user_mining_models WHERE model_name = 'ALL_MINILM_L12_V2';
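As a further sanity check, you can generate an embedding directly with the imported model using the VECTOR_EMBEDDING SQL function (the sample text here is arbitrary):

-- Returns the vector produced by the imported in-database model
SELECT VECTOR_EMBEDDING(ALL_MINILM_L12_V2 USING 'hello world' AS data) AS embedding
  FROM dual;
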
This documentation provides more information about accessing and creating ONNX-format transformers. Note that Oracle converted the all-MiniLM-L12-v2 transformer to ONNX and bundled the necessary pre- and post-processing steps into the file all_MiniLM_L12_v2.onnx, which is hosted in OCI Object Storage.
Why use Select AI RAG?
Select AI RAG makes it easy to use LLMs with semantic similarity search to generate responses based on your private data using natural language prompts. With Select AI RAG, you can get more relevant responses based on up-to-date information, while helping to reduce the risk of hallucination.
By leveraging Select AI with RAG to augment your prompt with your enterprise data, you avoid hand-coding the RAG workflow to orchestrate the various components. Select AI RAG simplifies and automates key RAG steps:
- Create and populate a vector index using your documents in object storage
- Augment your prompt with relevant content retrieved through semantic similarity search
- Send this augmented prompt to your specified LLM and return generated results
Note that Select AI RAG not only simplifies the creation of your vector index but also automatically updates it by looking for new files in your bucket through periodic refreshes. The index refresh frequency can be set using the “refresh_rate” attribute when you create the vector index.
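A minimal sketch of creating such an index with a refresh interval, assuming an object storage bucket and credential you have already set up (the bucket URL and credential name below are illustrative placeholders, and “refresh_rate” is given in minutes):

BEGIN
  DBMS_CLOUD_AI.CREATE_VECTOR_INDEX(
    index_name => 'MY_INDEX',
    attributes => '{"vector_db_provider": "oracle",
                    "location": "https://objectstorage.region.oraclecloud.com/n/mytenancy/b/my-bucket/o/",
                    "object_storage_credential_name": "MY_CRED",
                    "profile_name": "OCI_GENAI",
                    "refresh_rate": 1440}'
  );
END;
/

Here the index is rebuilt against the bucket contents once a day (1440 minutes).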
With the new support for in-database transformers, Select AI RAG keeps the vector generation process local to your database, which eliminates the round trip to an external AI provider with its corresponding potential latency and compute costs.
For more information…
See the following resources:
- Blog: Select AI RAG announcement blog
- Documentation: Use Select AI for Natural Language Interaction with your Database
- Documentation: DBMS_CLOUD_AI package
- Video: Getting Started with Oracle Select AI
- Blog: Now Available! Pre-built Embedding Generation model for Oracle Database 23ai
- Blog: Enhance your AI/ML applications with flexible Bring Your Own Model options
- LiveLab: Chat with Your Data in Autonomous Database Using Generative AI
- LiveLab: Develop apps using GenAI, Autonomous Database, and React