Simplify Embedding Model Conversion with the OML4Py Client Container

The OML4Py client container gives you a ready-to-use environment for preparing Hugging Face embedding models for use with Oracle AI Vector Search and Oracle Select AI. Instead of building the environment yourself, you pull a single prebuilt image that has everything needed to convert relevant models into the ONNX format that AI Vector Search uses to generate embeddings.

These embedding models convert text into numerical vectors that capture meaning. Many AI applications rely on them for semantic similarity search, which ranks results by how close their meaning is to the query rather than by matching keywords.
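To make the idea concrete, here is a minimal sketch of similarity search over embeddings. The three-dimensional vectors and document titles are invented for illustration; a real embedding model produces vectors with hundreds of dimensions. Cosine similarity is used here as one common distance measure, not necessarily the one your application will choose.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings (made-up values, not output from a real model).
documents = {
    "How to reset a password": [0.9, 0.1, 0.0],
    "Quarterly revenue report": [0.0, 0.2, 0.9],
    "Recovering account access": [0.8, 0.3, 0.1],
}

# Stands in for the embedding of a query such as "I forgot my login".
query_vector = [0.85, 0.2, 0.05]

# Rank documents from most to least similar to the query.
ranked = sorted(
    documents,
    key=lambda title: cosine_similarity(query_vector, documents[title]),
    reverse=True,
)
for title in ranked:
    score = cosine_similarity(query_vector, documents[title])
    print(f"{score:.3f}  {title}")
```

Note that the query shares no keywords with the two account-related titles, yet both rank well above the revenue report because their vectors point in a similar direction.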

Why a Container?

Installing and configuring the OML4Py client used to mean building Python from source, pinning a set of required third-party packages to exact versions, installing the Oracle Instant Client, and setting environment variables. Every new environment meant repeating those steps, and the process assumed some experience configuring a Python environment. The OML4Py client container collapses that multi-step installation into a single Podman pull.

In an earlier post, we introduced how the OML4Py 2.1.1 client converts pretrained transformer models from Hugging Face to ONNX format for use with Oracle AI Vector Search. The client augments a model with the tokenization and pooling steps that turn text into embeddings, then loads it into the database so you can search by semantic meaning rather than exact keywords, without moving data to a separate vector database. Select AI also uses these embeddings in two ways: for retrieval-augmented generation (RAG), where they match user questions to relevant internal documents, and for NL2SQL Feedback, where feedback is stored in a vector index used to refine future SQL generation.
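As a rough illustration of what a pooling step does, the sketch below averages per-token vectors into a single sentence vector and then normalizes it. Mean pooling and L2 normalization are assumptions chosen because they are common for sentence-embedding models; the pooling actually applied to a given model depends on that model's configuration. The token vectors are invented values.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask value is 1; padding is skipped."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask excludes every token")
    # zip(*kept) groups the values of each dimension across tokens.
    return [sum(dim) / len(kept) for dim in zip(*kept)]

def l2_normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)

# One sentence, four token positions, three dimensions per token.
# The last position is padding and must not influence the result.
token_vectors = [
    [1.0, 0.0, 2.0],
    [3.0, 0.0, 0.0],
    [2.0, 3.0, 1.0],
    [9.0, 9.0, 9.0],  # padding
]
attention_mask = [1, 1, 1, 0]

pooled = mean_pool(token_vectors, attention_mask)
embedding = l2_normalize(pooled)
print(pooled)
print(embedding)
```

Bundling steps like these into the ONNX model is what lets the database accept raw text and return a vector, with no preprocessing code on the application side.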

What’s Inside the Container?

Instead of assembling the environment yourself, you pull one image based on Oracle Linux 8 with Python 3.13.5 and the required dependencies already installed and verified to work together, including:

oracledb 3.3.0
pandas 2.2.3
numpy 2.1.0
scikit-learn 1.6.1
scipy 1.14.1
matplotlib 3.10.4
torch 2.10.0
onnxruntime 1.20.0
transformers 4.56.1

This means no dependency conflicts to resolve, a reproducible setup across machines, and faster onboarding. From there, you can start converting models immediately.
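If you want to confirm what a given image actually contains, a short check such as the following could be run inside the container. It is a sketch, not part of the OML4Py tooling: the expected versions are copied from the list above, and the helper name `check_versions` is hypothetical. Some builds report a local suffix (for example a `+cpu` tag on torch), so the comparison ignores anything after a `+`.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this post; adjust if your image differs.
EXPECTED = {
    "oracledb": "3.3.0",
    "pandas": "2.2.3",
    "numpy": "2.1.0",
    "scikit-learn": "1.6.1",
    "scipy": "1.14.1",
    "matplotlib": "3.10.4",
    "torch": "2.10.0",
    "onnxruntime": "1.20.0",
    "transformers": "4.56.1",
}

def check_versions(expected, lookup=version):
    """Return {package: (wanted, found, matches)} for each expected package."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except PackageNotFoundError:
            found = None
        # Ignore a local build suffix such as "+cpu" when comparing.
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (wanted, found, matches)
    return report

for name, (wanted, found, ok) in check_versions(EXPECTED).items():
    status = "ok" if ok else "MISMATCH"
    print(f"{name:14} expected {wanted:8} found {found or 'not installed':14} {status}")
```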

Deployment Options

Where you deploy the converted model depends on where your data and applications reside. The model can be loaded into the database, and can also be leveraged by the Oracle Private AI Services Container to generate vector embeddings outside the database through an OpenAI-compatible REST API. The OML4Py container is the producer in the workflow. You convert a model once, then deploy it to your preferred target.

	Oracle AI Database 26ai using in-database ONNX Runtime	Private AI Services Container
Where embeddings are generated	Inside the database, where the model is accessible as a first-class database object	Outside the database, via a REST API
Best when	Your data already resides in the database and embedding compute demands are within acceptable ranges relative to other database workloads	Embedding generation would compete with other database compute demands; applications external to the database need embeddings on demand
Key benefits	No data movement; combine embeddings with SQL in a single query	Moves embedding computation off the core database into a near-database container; serves multiple consumers independently
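For the REST path, a client request might look roughly like the sketch below. It follows the general OpenAI embeddings convention (a JSON body with `model` and `input`, and a response with a `data` list of `embedding` entries). The host, port, `/v1/embeddings` path, and model name are all placeholders assumed for illustration; check the Private AI Services Container documentation for the actual endpoint and authentication requirements. The code only builds the request and does not send it.

```python
import json
import urllib.request

def build_embedding_request(base_url, model, texts):
    """Build a POST request in the OpenAI embeddings style (not sent here)."""
    payload = {"model": model, "input": texts}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/embeddings",  # assumed path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def parse_embeddings(response_body):
    """Extract vectors from an OpenAI-style response, in input order."""
    items = sorted(json.loads(response_body)["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]

# Placeholder endpoint and model name; replace with your deployment's values.
request = build_embedding_request(
    "http://localhost:8080", "my-onnx-model", ["hello world"]
)
print(request.get_method(), request.full_url)

# To send it against a running service:
#   with urllib.request.urlopen(request) as response:
#       vectors = parse_embeddings(response.read())
```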

Get Started

To walk through the complete setup, see this technical guide. It takes you from installing Podman through generating embeddings in the database, with every command and the expected output.

Install Podman and pull the container image from the Oracle Container Registry (including an air-gapped option)
Launch the container, with or without an Oracle wallet
Convert a preconfigured Hugging Face model to ONNX and load it into the database, either directly from Python or by exporting a file and loading it with DBMS_VECTOR.LOAD_ONNX_MODEL
Verify the model is loaded and generate your first embedding
Copy the converted model out of the container for use with the Private AI Services Container
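The convert, load, and verify steps above can be sketched in Python as follows. Treat this as an outline under stated assumptions rather than the guide's exact commands: `ONNXPipeline` and `export2file` reflect my understanding of the OML4Py 2.1 API and should be checked against the documentation, the helper names `convert_to_file` and `load_and_verify` are hypothetical, and binding the sample text inside `VECTOR_EMBEDDING` is assumed to be accepted by your database version. The `cursor` argument is expected to be a python-oracledb cursor on a connection you have already opened.

```python
import re

# The model name is interpolated into SQL as an identifier, so restrict it.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")

def convert_to_file(model_name, file_stem, output_dir="."):
    """Export a preconfigured Hugging Face model to an ONNX file."""
    # Lazy import: oml is only available inside the OML4Py client environment.
    # ONNXPipeline / export2file are assumed from the OML4Py 2.1 documentation.
    from oml.utils import ONNXPipeline
    ONNXPipeline(model_name=model_name).export2file(file_stem, output_dir=output_dir)

def load_and_verify(cursor, directory, file_name, model_name, sample_text):
    """Load an exported ONNX file, confirm it is registered, and embed a sample.

    Returns the number of dimensions in the generated embedding.
    """
    if not _IDENTIFIER.match(model_name):
        raise ValueError(f"unsafe model name: {model_name!r}")

    # directory is the name of a database directory object holding the file.
    cursor.callproc("DBMS_VECTOR.LOAD_ONNX_MODEL", [directory, file_name, model_name])

    # Unquoted model names are stored in uppercase in the data dictionary.
    cursor.execute(
        "SELECT model_name FROM user_mining_models WHERE model_name = :1",
        [model_name.upper()],
    )
    if cursor.fetchone() is None:
        raise RuntimeError(f"model {model_name} was not found after loading")

    cursor.execute(
        f"SELECT VECTOR_EMBEDDING({model_name} USING :1 AS data) FROM dual",
        [sample_text],
    )
    (vector,) = cursor.fetchone()
    return len(vector)

# Example call, assuming an open python-oracledb connection named `connection`
# and a directory object named ONNX_DIR (both hypothetical):
#   with connection.cursor() as cursor:
#       dims = load_and_verify(cursor, "ONNX_DIR", "my_model.onnx", "doc_model", "hello")
```

Passing the cursor in, rather than opening a connection inside the function, keeps wallet and credential handling in one place and makes the logic easy to exercise with a stand-in cursor.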

The OML4Py client container takes the setup work out of converting embedding models, enabling you to convert a Hugging Face model for use with AI Vector Search in a single prebuilt environment. Because the same converted model runs either inside the database or through the Private AI Services Container, you stay flexible in how and where you deploy it.

Resources

For more information on the OML4Py 2.1.1 container, refer to the documentation. For background on the Private AI Services Container, see the announcement post. For a step-by-step walkthrough of installing and configuring the container, see Getting Started with Private AI Services Container.


Sherry LaMonica

Lead Principal Product Manager, AI and Machine Learning
