Oracle Database 23ai, version 23.7, introduces built-in support for ONNX pipeline models, including CLIP (Contrastive Language-Image Pre-training) multi-modal embedding models, enabling seamless processing of both image and text inputs within a unified vector space. This advancement moves beyond traditional text-based semantic search, allowing developers to create systems that search through uploaded images using either text descriptions or image queries.

In this blog post, we explore how to use the CLIP multi-modal embedding model, deployed as ONNX models in Oracle Database 23ai, to build a hybrid search application with Oracle APEX. This application enables users to search uploaded images using either text or image inputs. By embedding images and text into a shared vector space, the system efficiently retrieves the most relevant results using vector similarity search, delivering a seamless and intuitive visual search experience.

Note: ONNX models must be augmented through a specialized pipeline to work with Oracle Database 23ai’s AI Vector Search capabilities. This pipeline includes preprocessing steps for both text (tokenization) and images, along with post-processing steps like normalization and pooling. The pipeline allows users to directly input raw text and image data since all preprocessing is internalized, eliminating the need to manually convert inputs to tensors. Starting with OML4Py 2.0, the OML4Py client streamlines this process by augmenting pre-trained transformer models from Hugging Face and converting them to ONNX format. After augmentation, these models can be imported into Oracle Database 23ai to generate embeddings for similarity search.
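
For illustration, once such an augmented pipeline model has been imported into the database (the model name below is hypothetical), raw text can be passed straight to the SQL VECTOR_EMBEDDING function, and the pipeline handles tokenization and post-processing internally:

-- Minimal sketch with a hypothetical model name; no manual tensor conversion is needed
SELECT VECTOR_EMBEDDING(my_clip_txt_pipeline USING 'a red sports car' AS data) AS text_vector
FROM dual;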

 

Note: If you are new to semantic search in Oracle APEX, we recommend starting with this foundational blog post, which covers text-based vector search. This post builds on that foundation by adding image support using CLIP.

Hybrid Search with ONNX Multi-Modal Embedding Models

To implement a hybrid search system, we leverage the CLIP multi-modal embedding model within Oracle Database 23ai and Oracle APEX. The CLIP model accepts text or image inputs and generates embeddings in a shared vector space. This approach enables us to search the uploaded images effectively, offering the following key capabilities:

  • Text Search: When a user enters a textual query, the CLIP text model converts the text into an embedding and compares this text embedding with the embeddings of stored images to find the most semantically relevant content.
  • Image Search: When a user uploads an image, the CLIP image model is used to generate an embedding for the image. This embedding is compared to the stored image embeddings in the database to retrieve visually similar images.
  • Unified Search: Whether a user enters text or uploads an image, the system compares the corresponding embedding with the stored embeddings and returns results ranked by similarity between the input and the stored data, as sketched below.
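
As a minimal illustration of this cross-modal comparison, the following sketch assumes the CLIP text model is loaded as clip_txt_model and that image embeddings are stored in a vector column such as sm_posts.ai_image_vector (both are set up later in this post):

-- Minimal sketch: rank stored image embeddings against a text query embedding
SELECT p.id, p.file_name
FROM sm_posts p
ORDER BY VECTOR_DISTANCE(
             p.ai_image_vector,
             VECTOR_EMBEDDING(clip_txt_model USING 'tire' AS data),
             COSINE)
FETCH FIRST 5 ROWS ONLY;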

Use Case: Hybrid Semantic Search in an Automotive Parts Application

We’ll demonstrate this with an example: the APEX Social Media Search app, in which users post and search automotive images.

Demo of AutoParts Application
Gif 1. Demo of APEX Social Media Search Application

Users can post images to the app and later search through these uploaded images using either:

  • Text Queries (e.g., “tire”).

  • Image Queries (e.g., A picture of a wheel).

[Download the Image Semantic Search App here] – Note: This application requires the CLIP ONNX model to be loaded and functional in Oracle Database 23ai.

Implementation:

Prerequisites

To implement AI Vector Search in Oracle Database 23ai and build Semantic Search capabilities in your Oracle APEX applications, you need to ensure the following prerequisites are met:

  • Oracle Database 23ai (version 23.7 and above)
    AI Vector Search was introduced in Oracle Database 23ai, with initial support starting in version 23.4. However, the multi-modal support essential for this semantic search implementation was introduced in version 23.7, so you must use version 23.7 or higher. 
    • Option: You can provision an Always Free Autonomous Database (ADB) instance running Oracle Database 23ai to meet this requirement at no additional cost. Learn more about availability here.
  • Oracle APEX 24.1 and above
    Oracle APEX version 24.1 or later is required to integrate seamlessly with the AI Vector Search features of Oracle Database 23ai. 

Order of Operations

Architecture of Image Semantic Search
Fig 1. Order of Operations

Step 1: Load CLIP Models 

Let’s begin by loading the CLIP ONNX models into the database:

  • clip_txt_model: Processes text inputs to generate embeddings.

  • clip_img_model: Processes image inputs to generate embeddings.

Step 2: Create Vector Providers

  • Text Vector Provider:

    Let’s create a vector provider in APEX with the following configurations:
    • Type: Database

    • Model: clip_txt_model

This provider handles text inputs for semantic search.

  • Image Vector Provider:

    Since APEX_AI.GET_VECTOR_EMBEDDINGS accepts only CLOB inputs, a workaround is required to process image (BLOB) data.
    1. Convert BLOB to CLOB: Use apex_web_service.blob2clobbase64() to convert the image BLOB to a Base64-encoded CLOB.

    2. Custom PL/SQL Function: Create a PL/SQL function that:
      • Accepts the Base64-encoded CLOB.
      • Decodes it back to a BLOB using apex_web_service.clobbase642blob() (since clip_img_model expects a BLOB input).
      • Processes the BLOB with clip_img_model to generate the embedding.
    3. Vector Provider: Create a vector provider in APEX with:
      • Type: Custom PL/SQL
      • Function: The custom function created above.

Step 3: Build the APEX Search Page

Design a search page in APEX that allows users to input either text or upload an image. The page should:

  • For Text Input:

    • Use the text vector provider to generate the query embedding. Create a Search Configuration that references the text model, then create a Search Page.

  • For Image Input:

    • Convert the uploaded image BLOB to a Base64-encoded CLOB.

    • Use APEX_AI.GET_VECTOR_EMBEDDINGS with the image vector provider to generate the query embedding.

The application then compares the query embedding with the precomputed embeddings in sm_posts.ai_image_vector using vector distance metrics (e.g., cosine similarity) to retrieve and display the most relevant results.

Step 4: Precompute and Store Image Embeddings

For each uploaded image (e.g., in the sm_posts table), compute its vector embedding using clip_img_model and store the result in a vector column such as sm_posts.ai_image_vector. This precomputation enables efficient similarity searches at query time.

Implementation Steps:

1. Grant Privileges in Database

The database schema needs specific privileges to work with ONNX models:

  • EXECUTE ON DBMS_CLOUD: Allows access to ONNX files in Object Storage.

  • CREATE MINING MODEL: Permits loading and using ONNX models.

Log in as ADMIN using Database Actions and run:

GRANT EXECUTE ON DBMS_CLOUD TO <Your Schema Name>;
GRANT CREATE MINING MODEL TO <Your Schema Name>;
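
To confirm the privilege grant from your application schema, you could run a quick check, for example:

-- Confirm the system privilege is visible in the current session
SELECT privilege FROM session_privs WHERE privilege = 'CREATE MINING MODEL';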

2. Load CLIP ONNX Pipeline Models on Database

First, extract the preconfigured model zip files. They contain two pipeline models, clip_vit_base_patch32_txt.onnx and clip_vit_base_patch32_img.onnx, which are used for text and image embeddings, respectively.

Upload the two CLIP ONNX files to Object Storage and generate Pre-Authenticated Requests (PARs) for both files, then load them into the database.

To do so, first create a credential so the database can access the files in Object Storage:

BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL (
    credential_name => 'onnx_obj_store_cred',
    username        => '<Your username>',
    password        => '<AUTH Token>'
  );
END;
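
Optionally, you can sanity-check the credential by listing the bucket contents (the bucket URI below is a placeholder):

-- Optional check: list the uploaded ONNX files using the new credential
SELECT object_name, bytes
FROM DBMS_CLOUD.LIST_OBJECTS(
         credential_name => 'onnx_obj_store_cred',
         location_uri    => '<Your Object Storage bucket URI>');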

 

Load the ONNX models from SQL Commands in Oracle APEX:

For the Text model:

BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    model_name => 'clip_txt_model',
    model_data => DBMS_CLOUD.GET_OBJECT(
                    credential_name => 'onnx_obj_store_cred',
                    object_uri      => '<Enter Your Object Storage URI>'
                  ),
    metadata   => JSON('{
      "function": "embedding",
      "embeddingOutput": "embedding",
      "input": { "input": ["data"] }
    }')
  );
END;
Load Text Model
Fig 2. Loading the Text model into the database

For the Image model:

BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    model_name => 'clip_img_model',
    model_data => DBMS_CLOUD.GET_OBJECT(
                    credential_name => 'onnx_obj_store_cred',
                    object_uri      => '<Enter Your Object Storage URI>'
                  ),
    metadata   => JSON('{
      "function": "embedding",
      "embeddingOutput": "embedding",
      "input": { "input": ["data"] }
    }')
  );
END;
Load Image Model
Fig 3. Loading the Image model into the database
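
After both loads complete, you can verify that the models are available in your schema, for example:

-- Both CLIP pipeline models should appear with the EMBEDDING mining function
SELECT model_name, algorithm, mining_function
FROM user_mining_models
ORDER BY model_name;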

 

3. Search Images Using a Text Query

  • Create the Vector Provider for Text Model
Create Text Vector Provider
Fig 4. Create Text Vector Provider
  • Create a Search Configuration (Oracle Vector Search)
Create Search Configuration
Fig 5. Create Search Configuration for Text Search
Create Search Configuration
Fig 6. Create Search Configuration for Text Search
  • Create a Text Search Page in the application that references the Text Search Configuration created above.
Create Search Page
Fig 7. Create Search Page

4. Search Images Using an Image Query

  • Create a PL/SQL Function for a custom vector provider:

CREATE OR REPLACE FUNCTION clip_image_model (p_base64_image IN CLOB) RETURN VECTOR IS
    l_blob   BLOB;
    l_vector VECTOR;
BEGIN
    -- Decode the Base64-encoded CLOB back into the original image BLOB,
    -- since clip_img_model expects a BLOB input
    l_blob := apex_web_service.clobbase642blob(p_base64_image);

    -- Generate the embedding with the in-database CLIP image model
    SELECT VECTOR_EMBEDDING(clip_img_model USING l_blob AS data)
      INTO l_vector
      FROM dual;

    RETURN l_vector;
END;
PL/SQL Function that Returns Embeddings
Fig 8. PL/SQL function that returns embeddings
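
As a quick, hypothetical sanity check (assuming SM_POSTS already contains at least one image), you can call the function directly:

-- Hypothetical test: embed one stored image through the custom function
SELECT clip_image_model(apex_web_service.blob2clobbase64(file_blob)) AS img_vector
FROM sm_posts
WHERE ROWNUM = 1;
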
  • Create a Vector Provider for Image Model:

Create Custom Vector Provider for Image
Fig 9. Create Custom Vector Provider for Image Model
  • Convert Uploaded Images to Vectors Using the ONNX Model

    Generate and store embeddings for existing post images so that search queries run efficiently against precomputed vectors. Use the “clip_image_model” vector provider together with the native function apex_ai.get_vector_embeddings to convert the values in FILE_BLOB into vectors and store them in the database.

         Add a new column AI_IMAGE_VECTOR to your SM_POSTS table to store the computed vector embeddings.

ALTER TABLE SM_POSTS ADD AI_IMAGE_VECTOR VECTOR;
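
If you prefer to pin the dimension and storage format, an alternative declaration can be used instead; the CLIP ViT-B/32 pipeline typically produces 512-dimensional embeddings, but verify your model's output dimension before fixing it:

-- Alternative declaration (assumes a 512-dimensional model output)
ALTER TABLE SM_POSTS ADD AI_IMAGE_VECTOR VECTOR(512, FLOAT32);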

        Run the following PL/SQL block to convert the images to vectors and populate the AI_IMAGE_VECTOR column:

BEGIN
 UPDATE sm_posts
 SET ai_image_vector = apex_ai.get_vector_embeddings (
     p_value => apex_web_service.blob2clobbase64(file_blob),
     p_service_static_id => 'clip_image_model');
END;
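
For images uploaded after this initial backfill, the same call can be restricted to rows that are still missing an embedding, for example:

-- Incremental variant: embed only posts that do not have a vector yet
BEGIN
 UPDATE sm_posts
 SET ai_image_vector = apex_ai.get_vector_embeddings (
     p_value => apex_web_service.blob2clobbase64(file_blob),
     p_service_static_id => 'clip_image_model')
 WHERE ai_image_vector IS NULL;
END;
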
  • Hybrid Query:

           Implement a single SQL query that detects an image input, calls the vector provider, retrieves the vector embedding, and filters the results based on the query.

           Source query of the Cards Region:

WITH vector_image AS (
   SELECT apex_ai.get_vector_embeddings (
       p_value   =>  apex_web_service.blob2clobbase64(blob_content),
       p_service_static_id => 'clip_image_model') as img_vector
   FROM apex_application_temp_files
   WHERE filename = :P3_FILENAME
   ORDER BY created_on DESC
   FETCH FIRST 1 ROWS ONLY
   )
   SELECT A.*,
       TO_CHAR(ROUND(vector_distance, 3), '0.999') AS vector_distance_display
   FROM (
   SELECT
       p.id,
       p.post_comment,
       p.file_blob,
       p.file_mime,
       p.file_name,
       p.created,
       p.created_by,
       p.updated,
       p.updated_by,
       CASE
           WHEN EXISTS (SELECT 1 FROM vector_image) THEN
               COSINE_DISTANCE(
                   (SELECT img_vector FROM vector_image),
                   p.ai_image_vector
               )
           ELSE null
       END AS vector_distance
   FROM sm_posts p
   ) A
   ORDER BY A.vector_distance, A.created DESC;
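
The Text Search Page from section 3 already covers typed queries; if you prefer to reuse this same cards region for text input as well, a hypothetical variant of the CTE could embed a page item through the text vector provider instead (the page item :P3_SEARCH_TEXT and the provider static ID 'clip_text_model' are assumptions, not names used above):

-- Hypothetical variant: embed a typed query through the text vector provider
WITH vector_query AS (
   SELECT apex_ai.get_vector_embeddings (
       p_value   => :P3_SEARCH_TEXT,
       p_service_static_id => 'clip_text_model') AS txt_vector
   FROM dual
)
SELECT p.id, p.file_name
FROM sm_posts p
ORDER BY COSINE_DISTANCE((SELECT txt_vector FROM vector_query), p.ai_image_vector)
FETCH FIRST 10 ROWS ONLY;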

Summary: A Seamless Hybrid Search Experience

Oracle Database 23ai (version 23.7) introduces built-in support for ONNX pipeline models, including CLIP multi-modal embedding models, enabling hybrid search systems that process both text and image inputs in a unified vector space. By generating embeddings with CLIP, we achieve accurate image-text similarity comparisons, improving search precision and user experience. Integrating these models with Oracle APEX simplifies content discovery, allowing users to search uploaded content seamlessly with either text or image queries.

For more details on generating embeddings from CLIP multi-modal embedding models in Oracle Database, refer to the official documentation: ONNX Pipeline Models for CLIP Multi-Modal Embedding.

References: