Oracle Database 23ai introduces Oracle AI Vector Search, revolutionizing semantic similarity search by generating vectors using transformer models and managing them at scale within the database. This empowers users to find relevant information based on meaning and context, eliminating the pain point of transferring data to separate vector databases, thus reducing complexity and operational overhead.
Starting with OML4Py 2.0, the client converts pre-trained transformer models from Hugging Face to the Open Neural Network Exchange (ONNX) format for use with AI Vector Search. Hugging Face is an open-source platform that offers thousands of pre-trained deep learning models for natural language processing (NLP), computer vision, audio, and more. ONNX is an open format for representing machine learning models, including transformers, classification, regression, and other model types.
The OML4Py client streamlines the path to embedding generation in the database by downloading pre-trained models from the Hugging Face repository, augmenting them with pre- and post-processing steps, converting the augmented model to ONNX format, and loading it into an in-database model or exporting it to a file. Once loaded in ONNX format, you can seamlessly use the in-database ONNX Runtime to generate vector embeddings for AI Vector Search.
Combined with the OML4Py client, AI Vector Search offers a comprehensive solution for leveraging vector embeddings in Oracle Database, enhancing text data analysis, search capabilities, and machine learning applications.
In this blog, we outline the steps to install the OML4Py client, provide examples of converting preconfigured and custom models to ONNX format, and load the models into Oracle Database. These steps apply to Oracle Database on-premises or in the cloud, and to Autonomous Database.
The OML4Py 2.0 client is supported on Linux 8 64-bit systems and works with Oracle Autonomous Database Serverless and Oracle Database instances. These instructions assume you have configured your OL8 repo in /etc/yum.repos.d, configured a wallet if using Autonomous Database, and set a proxy if needed.
1. Install Python
sudo yum install libffi-devel openssl openssl-devel tk-devel xz-devel zlib-devel bzip2-devel readline-devel libuuid-devel ncurses-devel libaio
mkdir -p $HOME/python
wget https://www.python.org/ftp/python/3.12.0/Python-3.12.0.tgz
tar -xvzf Python-3.12.0.tgz --strip-components=1 -C $HOME/python
cd $HOME/python
./configure --prefix=$HOME/python
make clean; make
make altinstall
2. Set variables for Python: PYTHONHOME, PATH, and LD_LIBRARY_PATH
export PYTHONHOME=$HOME/python
export PATH=$PYTHONHOME/bin:$PATH
export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH
3. Create symlinks for python3 and pip3
cd $HOME/python/bin
ln -s python3.12 python3
ln -s pip3.12 pip3
4. Install the Oracle Instant Client if you will load ONNX models into the database from Python. If you will instead export the model to a file and load it with DBMS_VECTOR.LOAD_ONNX_MODEL, skip this step and step 5, and see the note under the environment variables in step 6.
cd $HOME
wget https://download.oracle.com/otn_software/linux/instantclient/2340000/instantclient-basic-linux.x64-23.4.0.24.05.zip
unzip instantclient-basic-linux.x64-23.4.0.24.05.zip
5. Set variable LD_LIBRARY_PATH for the Oracle Instant Client
export LD_LIBRARY_PATH=$HOME/instantclient_23_4:$LD_LIBRARY_PATH
6. Create an environment file, such as env.sh, that defines the Python and Oracle Instant Client environment variables, and source it before each OML4Py client session. Alternatively, add the environment variable definitions to .bashrc so they are set when the user logs into the Linux machine.
# Environment variables for Python
export PYTHONHOME=$HOME/python
export PATH=$PYTHONHOME/bin:$PATH
export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH
# Note: If Python is used to load models to the database, set this environment variable for the Oracle Instant Client.
export LD_LIBRARY_PATH=$HOME/instantclient_23_4:$LD_LIBRARY_PATH
7. Create a file named requirements.txt containing the required third-party packages.
--extra-index-url https://download.pytorch.org/whl/cpu
pandas==2.1.1
setuptools==68.0.0
scipy==1.12.0
matplotlib==3.8.4
oracledb==2.2.0
scikit-learn==1.4.1.post1
numpy==1.26.4
onnxruntime==1.17.0
onnxruntime-extensions==0.10.1
onnx==1.16.0
torch==2.2.0+cpu
transformers==4.38.1
sentencepiece==0.2.0
8. Upgrade pip3 and install the packages in requirements.txt
pip3 install --upgrade pip
pip3 install -r requirements.txt
9. Install the OML4Py client
Download the OML4Py 2.0 client and upload it to your Linux client machine.
unzip oml4py-client-linux-x86_64-2.0.zip
pip3 install client/oml-2.0-cp312-cp312-linux_x86_64.whl
10. Start Python and import the oml package
$ python3
Python 3.12.0 (main, May 28 2024, 16:09:11) [GCC 8.5.0 20210514 (Red Hat 8.5.0-20.0.3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import oml
The OML4Py client provides built-in property configurations, referred to as templates, that contain common configurations for text-based models. These configurations specify pre- and post-processing steps, as well as the supported language and applicable distance metrics. The template contains a default configuration for the pre-trained models, which can be modified to create custom configurations for specific use cases. OML4Py converts these models into ONNX format, handling the essential tokenization and post-processing steps. The resulting ONNX pipeline is then loaded into your Oracle Database 23ai instance from Python, or exported to a file and loaded into the database using the DBMS_VECTOR.LOAD_ONNX_MODEL procedure.
Import the embedding model classes and show the available preconfigured models.
from oml.utils import EmbeddingModel, EmbeddingModelConfig
EmbeddingModelConfig.show_preconfigured()
['sentence-transformers/all-mpnet-base-v2',
'sentence-transformers/all-MiniLM-L6-v2',
'sentence-transformers/multi-qa-MiniLM-L6-cos-v1',
'ProsusAI/finbert',
'medicalai/ClinicalBERT',
'sentence-transformers/distiluse-base-multilingual-cased-v2',
'sentence-transformers/all-MiniLM-L12-v2',
'BAAI/bge-small-en-v1.5',
'BAAI/bge-base-en-v1.5',
'taylorAI/bge-micro-v2',
'intfloat/e5-small-v2',
'intfloat/e5-base-v2',
'prajjwal1/bert-tiny',
'thenlper/gte-base',
'thenlper/gte-small',
'TaylorAI/gte-tiny',
'infgrad/stella-base-en-v2',
'sentence-transformers/paraphrase-multilingual-mpnet-base-v2',
'intfloat/multilingual-e5-base',
'intfloat/multilingual-e5-small',
'sentence-transformers/stsb-xlm-r-multilingual']
More information on the models in this list can be found on the Hugging Face website. For example, the all-MiniLM-L6-v2 model is part of the sentence-transformers library. This model takes sentences or paragraphs and converts them into 384-dimensional vectors, referred to as embeddings.
Each of these 384 dimensions captures a specific aspect of the sentence's meaning or characteristics. By representing sentences in this dense vector space, we can perform various tasks such as clustering similar sentences or conducting semantic searches, where we look for sentences that have similar meanings.
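To illustrate what these embeddings enable, here is a short Python sketch comparing two vectors with cosine similarity. The vectors here are random placeholders rather than real model output; with actual embeddings, a score near 1 indicates similar meaning.
import numpy as np

# Placeholder 384-dimensional vectors; real embeddings would come from the model
rng = np.random.default_rng(42)
emb_a = rng.normal(size=384)
emb_b = rng.normal(size=384)

# Cosine similarity: near 1 for semantically similar sentences,
# near 0 for unrelated ones (with real embeddings)
cos_sim = np.dot(emb_a, emb_b) / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b))
print(f"cosine similarity: {cos_sim:.4f}")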
Using Python, invoke export2file() to generate an ONNX file from the preconfigured all-MiniLM-L6-v2 model.
em = EmbeddingModel(model_name="sentence-transformers/all-MiniLM-L6-v2")
em.export2file("all-MiniLM-L6", output_dir=".")
This exports the ONNX-format model to your local file system:
ls -la all-MiniLM-L6.onnx
-rw-rw-r--. 1 opc opc 90621438 May 28 16:36 all-MiniLM-L6.onnx
Move the ONNX file to the database server. Create the directory on the file system, create a directory object in the database that points to it, and apply the necessary permissions and grants. In this example, we are using a pluggable database named ORCLPDB.
mkdir -p /tmp/models
sqlplus / as sysdba
alter session set container=ORCLPDB;
-- directory to store ONNX files for import
CREATE DIRECTORY ONNX_IMPORT AS '/tmp/models';
-- grant your OML user read and write permissions on the directory
GRANT READ, WRITE ON DIRECTORY ONNX_IMPORT to OMLUSER;
-- grant to allow user to import the model
GRANT CREATE MINING MODEL TO OMLUSER;
Use the DBMS_VECTOR.LOAD_ONNX_MODEL procedure to load the model in your OML user schema. In this example, the procedure loads the ONNX model file all-MiniLM-L6.onnx from the ONNX_IMPORT directory into the database as a model named ALL_MINILM_L6.
BEGIN
DBMS_VECTOR.LOAD_ONNX_MODEL(
directory => 'ONNX_IMPORT',
file_name => 'all-MiniLM-L6.onnx',
model_name => 'ALL_MINILM_L6');
END;
/
PL/SQL procedure successfully completed.
Verify the model exists in the database.
SELECT MODEL_NAME, ALGORITHM, MINING_FUNCTION
FROM USER_MINING_MODELS
WHERE MODEL_NAME='ALL_MINILM_L6';
--------------------------------------------------------------------------
MODEL_NAME ALGORITHM MINING_FUNCTION
--------------------------------------------------------------------------
ALL_MINILM_L6 ONNX EMBEDDING
Implicitly available with each model are associated model detail views, which serve as system-provided documentation. These include the model settings and other content that can be algorithm-specific. For model ALL_MINILM_L6, we can obtain the set of model views using the following query:
SELECT VIEW_NAME, VIEW_TYPE
FROM USER_MINING_MODEL_VIEWS
WHERE MODEL_NAME='ALL_MINILM_L6'
ORDER BY VIEW_NAME;
----------------------------------------------------
VIEW_NAME VIEW_TYPE
-----------------------------------------------------
DM$VJALL_MINILM_L6 ONNX Metadata Information
DM$VMALL_MINILM_L6 ONNX Model Information
DM$VPALL_MINILM_L6 ONNX Parsing Information
Each model view has a unique name: the "DM$V" prefix, followed by a letter indicating the type of information the view contains (J = JSON metadata, M = model information, P = parsing information), followed by the model name.
Querying the ONNX Model Information view, we see the name of the tool that produced the ONNX model, the graph name and version, the model input mapping, and the output vector information with the dimension and type.
SELECT * FROM DM$VMALL_MINILM_L6;
--------------------------------------------------------------------------
NAME VALUE
--------------------------------------------------------------------------
Producer Name onnx.compose.merge_models
Graph Name tokenizer_main_graph
Graph Description Graph combining tokenizer and main_graph tokenizer main_graph
Version 1
Input[0] input:string[?]
Output[0] embedding:float32[?,384]
6 rows selected.
Generate embedding vectors using the VECTOR_EMBEDDING SQL scoring function. Partial output is shown below.
SELECT VECTOR_EMBEDDING(ALL_MINILM_L6 USING 'RES' as DATA) AS embedding;
EMBEDDING
--------------------------------------------------------------------------------
[-1.16423041E-001,1.54331913E-002,-4.69262972E-002,7.1672732E-003,3.5023436E-002
,-4.02988493E-002,1.84127204E-002,6.48672804E-002,1.49925603E-002,1.20538445E-00
2,-3.54219265E-002,-9.51012298E-002,-2.08232403E-002,4.99225557E-002,-1.86311286
E-002,-2.62796637E-002,-3.2601878E-002,5.22731915E-002,-9.84884799E-002,9.467907
95E-004,4.39537093E-002,1.06041664E-002,2.93337256E-002,-4.25205268E-002,-2.5767
...
...
An alternative to using export2file() with DBMS_VECTOR.LOAD_ONNX_MODEL is to load the model directly into the database from Python. This requires the Oracle Instant Client, installed as described in steps 4 and 5 of the OML4Py client installation section.
Import the oml library and EmbeddingModel from oml.utils
import oml
from oml.utils import EmbeddingModel
Set embedded mode to false for Oracle Database on-premises. Note that this setting is neither supported nor required for Oracle Autonomous Database.
oml.core.methods.__embed__ = False
Create a database connection. Update the parameters below to match your credentials and database environment.
# Oracle Database
oml.connect("<user>", "<password>", port=<port number> host="<hostname>", service_name="<service name>")
# Oracle Autonomous Database
oml.connect(user="<user>", password="<password>", dsn="myadb_low")
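Before loading the model, you can confirm the session is established with oml.isconnected(), which returns True once connected:
# Returns True once the database connection is established
oml.isconnected()
True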
Load the preconfigured model to the database. (This step is equivalent to using DBMS_VECTOR.LOAD_ONNX_MODEL.)
em = EmbeddingModel(model_name="sentence-transformers/all-MiniLM-L6-v2")
em.export2db("ALL_MINILM_L6")
Once imported, the model and its views can be queried as shown above, and you can generate embeddings from Python or SQL:
cr = oml.cursor()
data = cr.execute("select vector_embedding(ALL_MINILM_L6 using 'RES' as DATA) AS embedding from dual")
data.fetchall()
[(array('f', [-0.11642304062843323, 0.015433191321790218, -0.04692629724740982, 0.007167273201048374,
0.035023435950279236, -0.04029884934425354, 0.018412720412015915, 0.0648672804236412, 0.014992560259997845, 0.012053844518959522, -0.03542192652821541, -0.09510122984647751, -0.020823240280151367, 0.049922555685043335, -0.01863112859427929, -0.026279663667082787, -0.03260187804698944, 0.05227319151163101, -0.09848847985267639, 0.0009467907948419452, 0.04395370930433273, 0.01060416642576456, 0.029333725571632385, -0.04252052679657936, -0.025767622515559196, -0.03291202709078789, -0.053170595318078995, 0.07415973395109177, 0.0082617262378335,
...
...
SELECT VECTOR_EMBEDDING(ALL_MINILM_L6 USING 'RES' as DATA) AS embedding;
EMBEDDING
--------------------------------------------------------------------------------
[-1.16423041E-001,1.54331913E-002,-4.69262972E-002,7.1672732E-003,3.5023436E-002
,-4.02988493E-002,1.84127204E-002,6.48672804E-002,1.49925603E-002,1.20538445E-00
2,-3.54219265E-002,-9.51012298E-002,-2.08232403E-002,4.99225557E-002,-1.86311286
E-002,-2.62796637E-002,-3.2601878E-002,5.22731915E-002,-9.84884799E-002,9.467907
95E-004,4.39537093E-002,1.06041664E-002,2.93337256E-002,-4.25205268E-002,-2.5767
...
...
In addition to the default configuration, users have the flexibility to create a custom configuration using a text template provided with OML4Py. The custom configuration is passed in the config parameter as shown below. In this case, we use a maximum of 256 tokens as input to the model and the cosine similarity metric for the vector distance operator. Note that the maximum size for an in-database model is 1 GB; quantizing the model reduces its memory footprint to stay under that limit.
from oml.utils import EmbeddingModel, EmbeddingModelConfig
config = EmbeddingModelConfig.from_template("text", max_seq_length=256, distance_metrics=["COSINE"], quantize_model=True)
em = EmbeddingModel(model_name="sentence-transformers/all-MiniLM-L6-v2", config=config)
em.export2db("ALL_MINILM_L6")
Refer to the Properties table in the OML4Py User's Guide for a complete list of properties that can be defined with custom configurations. Note that preconfigured models already have these properties set to specific values.
The embeddings produced by VECTOR_EMBEDDING can be stored in a database table to enable AI Vector Search, as sketched below. This integrates seamlessly with the Oracle ecosystem, leveraging the database's capabilities for data management, security, and scalability.
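As a minimal sketch of that workflow, the following Python snippet uses the blog's oml.cursor() pattern to create a table with a VECTOR column, store an embedding generated by the in-database model, and run a similarity query. The docs table, its columns, and the query strings are hypothetical examples, not objects created earlier in this post.
# Hypothetical table for storing text alongside its embedding
cr = oml.cursor()
cr.execute("""
    CREATE TABLE docs (
        id        NUMBER GENERATED ALWAYS AS IDENTITY,
        text      VARCHAR2(4000),
        embedding VECTOR(384, FLOAT32))""")

# Generate the embedding at insert time using the in-database ONNX model
cr.execute("""
    INSERT INTO docs (text, embedding)
    SELECT 'RES', VECTOR_EMBEDDING(ALL_MINILM_L6 USING 'RES' AS DATA)
    FROM dual""")

# Retrieve the rows closest in meaning to a query string, by cosine distance
rows = cr.execute("""
    SELECT id, text FROM docs
    ORDER BY VECTOR_DISTANCE(embedding,
             VECTOR_EMBEDDING(ALL_MINILM_L6 USING 'reservation' AS DATA),
             COSINE)
    FETCH FIRST 3 ROWS ONLY""").fetchall()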
For more information, see:
Oracle Machine Learning