Vector‑native RAG on Oracle: embeddings, HNSW/IVF, and hybrid search under database governance

This is article 6 of 8 in my Oracle AI Database Skills series.

Key Takeaways

Storing vectors in an Oracle VECTOR column alongside content, metadata, and provenance means retrieval happens inside the database. Existing governance — row-level security, auditing, data masking — applies to vector queries the same way it applies to any other query.
Hybrid retrieval is ordinary SQL: VECTOR_DISTANCE handles semantic similarity and WHERE clauses handle business predicates in the same statement. Any reviewer who can read a query can understand what rows qualified and why.
HNSW and IVF are index strategies with real trade-offs in recall, memory footprint, and query latency. Choosing between them requires measuring both on your own corpus — they are not interchangeable defaults.
The answer step can stay inspectable too. Connecting retrieval to a Select AI profile with SHOWSQL means the SQL generation step is reviewable, not just the retrieval that fed it.

Retrieval‑augmented generation works when you can fetch the right context quickly and explain—in plain SQL—why those rows qualified. Splitting storage between a stand‑alone vector store and the database of record speeds early prototypes but makes governance fragile: metadata is duplicated, filters are reimplemented, and audit trails often end at a gateway you do not control. Oracle’s vector‑native approach keeps the entire RAG path inside the database. Embeddings live in a VECTOR column next to the text, JSON, and provenance you already track; you index them with HNSW or IVF using CREATE VECTOR INDEX; you retrieve with SQL that combines semantic similarity and business predicates. Because retrieval runs on database objects, existing policies and audits—when configured on those objects—stay in the path. When you need an LLM to compose the answer, Select AI with AI Profiles lets you scope what the model can see and inspect generated SQL before execution. The result is a retrieval path that is fast enough for interactive use and reviewable under the same governance you already operate.

Route → act → trust for in‑database RAG. Retrieval is SQL; governance stays in the path.

If you read Article 5, you saw how AI Profiles keep NL2SQL inspectable. This article focuses on retrieval: how to store vectors, choose HNSW or IVF, and write hybrid queries you can defend in a review meeting.

Prerequisites

Oracle AI Database 26ai or Autonomous Database with AI Vector Search enabled for your RU/service tier.
Privileges to CREATE TABLE and CREATE INDEX in a schema with sufficient quota.
For external embeddings or Select AI, configure credentials and network ACLs per your organization’s policy.
Verify CREATE VECTOR INDEX options and the DBMS_* package signatures in your RU before automating builds.

What “vector‑native” means in Oracle

Vector‑native means embeddings are first‑class database data, not a sidecar. You add a VECTOR(dim, element_type) column to the table that holds your chunked content and provenance; you build an approximate nearest‑neighbor (ANN) index—HNSW for interactive recall or IVF for very large corpora—and you query with SQL. The practical effect is governance continuity: when Virtual Private Database (VPD) policies, Unified Auditing, Data Redaction, and Transparent Data Encryption (TDE) are configured on the relevant objects, they continue to filter, record, mask, and protect your retrieval path. When someone asks why a row entered the context window, you can point to the same table‑scoped policies that govern every other query.

Diagram titled “Oracle Vector Objects” showing a DOCUMENTS table with fields for ID, source, published date, content, and embedding vectors. The table connects to two Oracle vector indexes: DOCS_HNSW_IDX using an in-memory neighbor graph with cosine distance, and DOCS_IVF_IDX using neighbor partitions with cosine distance. A note references business predicates for filtering and retrieval. — Oracle Database vector objects with HNSW and IVF indexes for document embedding retrieval

For architecture and security details, see the Oracle AI Vector Search overview and the Database Security Guide for VPD, Unified Auditing, Data Redaction, and TDE:

From document to chunk to embedding

RAG quality depends heavily on two choices you control: where you cut chunks and how you embed them. Oracle provides DBMS_VECTOR_CHAIN utilities for deterministic chunking with tunable size and overlap, so the same document splits the same way as you iterate. For embeddings, you can run an in‑database model (see DBMS_VECTOR support for installed models and ONNX where documented for your RU) or call an external provider with credentials stored in the database and governed by network ACLs. Treat those decisions as versioned preferences you can update deliberately.

Keep the storage shape simple: one row per chunk with the content, the provenance you already carry (source, timestamps, tenant), and a VECTOR column for the embedding. That single‑row design makes hybrid retrieval straightforward because eligibility is a WHERE clause and semantic similarity is an ORDER BY, all in one statement.

Docs:

DBMS_VECTOR (embedding/model support)
DBMS_VECTOR_CHAIN (chunking and embedding helpers)

DBMS_HYBRID_VECTOR: keyword + semantic in one call

DBMS_VECTOR_CHAIN and DBMS_VECTOR handle pure vector workflows. When your retrieval also needs full‑text keyword matching – relevance ranking by term frequency, Oracle Text lexer rules, or multi‑language tokenization – DBMS_HYBRID_VECTOR adds that layer. The package provides a JSON‑based SEARCH API that searches by keywords and vectors against hybrid vector indexes. By integrating traditional keyword‑based text search with vector‑based similarity search in a single call, it can improve recall for queries where either approach alone misses relevant results.

Use DBMS_HYBRID_VECTOR when:

Your corpus has mixed-quality writing (exact keywords matter for short queries, semantic similarity matters for paraphrased ones).
You support multiple languages with different stemming rules that Oracle Text already handles.
You want a single managed search API rather than hand-tuned SQL merging of separate text and vector result sets.

For pure vector retrieval with business-predicate filters, the WHERE + VECTOR_DISTANCE pattern in the section below is simpler and sufficient. Reserve DBMS_HYBRID_VECTOR for the cases where keyword recall meaningfully improves results on your corpus—measure both before committing to the more complex path.

Doc:

DBMS_HYBRID_VECTOR

Choosing an index: HNSW or IVF

HNSW and IVF both deliver sub‑second nearest‑neighbor search, but they trade memory, build time, and recall differently.

HNSW builds an in‑memory navigable small‑world graph. In practice, it often yields high recall at low latency for interactive workloads, at the cost of a memory footprint that grows with corpus size and neighbor degree. Use the documented memory and accuracy advisors in DBMS_VECTOR to size the index pool and validate recall on your corpus before you build. If you are powering a help center, runbooks, or an internal knowledge base where queries arrive piecemeal and you value accuracy, HNSW is a common starting point – measure and tune for your data.

IVF clusters the vector space into partitions and probes only a subset per query. That design can lower memory footprint and sustain good throughput on very large corpora. You trade some recall and tune it by changing how many partitions to probe. If you manage millions of chunks and steady traffic, evaluate IVF operationally using Oracle’s accuracy and performance advisors.

One optimizer detail matters to plans and expectations: the optimizer uses a vector index only when your query’s distance metric matches the index’s metric. If they differ, the index will not be used and an exact search is performed. If your embeddings use cosine similarity (common for text), create the index with DISTANCE COSINE and use the same metric in ORDER BY.

Diagram titled “Choosing a Vector Index” comparing HNSW and IVF vector indexing methods. The HNSW section highlights high recall with low latency, memory scaling with graph settings, suitability for dynamic top-K retrieval, and DBMS_VECTOR advisors. The IVF section describes centroid-based partitions, probe tuning for recall and latency, and suitability for large stable corpora. A note emphasizes matching index distance metrics to the query metric, such as cosine similarity. — Comparison of HNSW and IVF vector indexes for Oracle Database vector search

Docs:

CREATE VECTOR INDEX (HNSW/IVF)
DBMS_VECTOR advisors and accuracy helpers

Writing hybrid retrieval you can defend

Hybrid search in Oracle is just SQL. Your business predicates – tenant, source system, publish date, access level—are ordinary filters in WHERE, and VECTOR_DISTANCE ranks eligible rows by semantic similarity. Because this happens on a single table with a single row shape, the execution plan can combine a vector index for similarity with conventional indexes for filters. When configured on the relevant objects, VPD policies still apply to the session.

A typical query looks like this:

SELECT id, source, content
FROM   documents
WHERE  source = 'kb'
AND    published >= DATE '2025-01-01'
ORDER  BY VECTOR_DISTANCE(embedding, :qvec, COSINE)
FETCH  FIRST 5 ROWS ONLY;

Ensure the index’s DISTANCE metric matches the query’s metric; otherwise the vector index may not be used.

Diagram titled “Hybrid Retrieval You Can Review.” A query vector feeds into retrieval SQL that uses vector distance ranking and metadata filters. The query accesses a vector index (HNSW/IVF) to return top-K rows with provenance. EXPLAIN PLAN and SHOWSQL provide visibility into the retrieval process for inspection and review. — Reviewable hybrid retrieval workflow using Oracle Database vector search and explainable SQL

Two small details avoid surprises in production. VECTOR_DISTANCE returns smaller values for closer vectors, so you sort ascending and keep the metric explicit to match the index. Also, the query vector’s dimension and element type must match the column’s or the statement will error. In plans, you will often see the vector index satisfy similarity ordering while B‑trees or partitions satisfy filters; the optimizer’s choice depends on predicate selectivity, statistics, and metric/index alignment.

If you prefer operator syntax, the <=> operator is the cosine distance operator in 26ai. Use it only if your target version documents it; otherwise prefer VECTOR_DISTANCE(..., COSINE) explicitly:

SELECT id, source, content
FROM   documents
ORDER  BY embedding <=> :qvec
FETCH  FIRST 3 ROWS ONLY;

Docs:

A tiny end‑to‑end demo

This toy example shows the loop from rows to top‑K in a few statements. It runs on Oracle AI Database 26ai or Autonomous Database. Replace the three‑dimensional vectors with your model’s actual dimension (384/768/1536 are common), and use FLOAT32 for typical embeddings. For real data, prefer CLOB when content exceeds 4000 bytes.

1) Create a table that keeps content, provenance, and a vector together:

CREATE TABLE documents (
  id         NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  source     VARCHAR2(64),
  published  DATE,
  content    VARCHAR2(4000), -- use CLOB for larger text in production
  embedding  VECTOR(3, FLOAT32)
);

2) Insert a few rows with deterministic toy vectors so results are obvious:

INSERT INTO documents (source, published, content, embedding) VALUES
  ('kb', DATE '2025-01-15', 'How to reset your password',
   TO_VECTOR('[0.10, 0.05, 0.90]', 3, FLOAT32));

INSERT INTO documents (source, published, content, embedding) VALUES
  ('kb', DATE '2025-02-10', 'How to export monthly invoices',
   TO_VECTOR('[0.80, 0.10, 0.10]', 3, FLOAT32));

INSERT INTO documents (source, published, content, embedding) VALUES
  ('runbook', DATE '2025-03-05', 'Rotate API keys every 90 days',
   TO_VECTOR('[0.15, 0.85, 0.10]', 3, FLOAT32));
COMMIT;

3) Create a cosine HNSW index and keep the metric explicit:

CREATE VECTOR INDEX docs_hnsw_idx
ON documents (embedding)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE;

The ORGANIZATION clause names the HNSW structure as documented in the SQL Reference. Verify feature availability and any licensing considerations in your environment, and keep the query metric aligned (COSINE here) for the optimizer to consider this index.

4) Run a top‑K query. VECTOR_DISTANCE returns smaller values for closer matches, so use ascending ORDER BY. Using a literal vector for illustration (in applications, pass vectors as binds rather than literals to avoid hard parsing and ensure type/dimension alignment):

SELECT id, source, content
FROM   documents
ORDER  BY VECTOR_DISTANCE(
           embedding,
           TO_VECTOR('[0.12, 0.04, 0.92]', 3, FLOAT32),
           COSINE
         )
FETCH  FIRST 3 ROWS ONLY;

Using a bind (recommended in applications). Ensure `:qvec` matches the column’s dimension and element type:

-- Suppose :qvec is a VECTOR(3, FLOAT32) bind variable
SELECT id, source, content
FROM   documents
ORDER  BY VECTOR_DISTANCE(embedding, :qvec, COSINE)
FETCH  FIRST 3 ROWS ONLY;

Operator shorthand (if documented in your RU; <=> is cosine):

SELECT id, source, content
FROM   documents
ORDER  BY embedding <=> :qvec
FETCH  FIRST 3 ROWS ONLY;

5) Add a business predicate to show hybrid eligibility in action:

SELECT id, source, content
FROM   documents
WHERE  source = 'kb'
ORDER  BY VECTOR_DISTANCE(
           embedding,
           TO_VECTOR('[0.12, 0.04, 0.92]', 3, FLOAT32),
           COSINE
         )
FETCH  FIRST 2 ROWS ONLY;

If your corpus grows very large, evaluate IVF. Drop the HNSW index first. Keep a single active vector index per column to simplify planning and maintenance.

Use the documented IVF clause (NEIGHBOR PARTITIONS) in your RU’s SQL Reference:

DROP INDEX docs_hnsw_idx;

CREATE VECTOR INDEX docs_ivf_idx
ON documents (embedding)
ORGANIZATION NEIGHBOR PARTITIONS
DISTANCE COSINE;

After creating IVF, verify search‑effort controls and probe counts for your RU in the SQL Reference and DBMS_VECTOR docs; use those documented knobs to balance recall and latency.

Reviewable Oracle AI answer workflow with visible SQL, provenance, and governed data access

Orchestrating the answer: in your app or with Select AI

Retrieval yields top‑K rows with provenance; you still need to compose an answer. Many teams keep this step in the application: run the retrieval SQL, log the query and bind values, and pass the returned snippets—plus id, source, and published—to an LLM. Because the context is explicit, reviewers can reconstruct what the model saw and sign off on the path.

If you prefer to orchestrate inside the database, use Select AI with a narrowly scoped AI Profile. During early review, enable SHOWSQL/EXPLAINSQL so a human can inspect generated SQL before execution. Syntax for USING/profile attributes varies by RU; follow the Select AI examples for your environment:

-- In a session with an AI Profile that scopes access
SELECT AI SHOWSQL 'List the top 3 KB articles about password resets from 2025.'
USING 'profile = <your_ai_profile>';

To prevent table data from being sent to a provider during review, use the data‑access controls documented for your RU/service tier (see Select AI examples). Enable the appropriate session/profile setting rather than relying on defaults.

Select AI can include sources in responses when configured; check the SOURCES option and relevant profile attributes for your RU, and keep SHOWSQL/EXPLAINSQL enabled until stakeholders are comfortable with the generated SQL.

Docs:

Select AI examples (SHOWSQL, EXPLAINSQL, object scoping, data access)

Operating in production shape

Demos are short; production is measurement and repeatability. For HNSW, memory depends on graph parameters and corpus size. Use DBMS_VECTOR memory advisors before large builds so you can size the index pool without starving the buffer cache, and measure recall and latency as you adjust graph parameters. For IVF, recall depends on partitioning and probe counts. Use accuracy helpers to compare index settings against exact search on a representative sample, and decide whether a point of recall is worth the latency trade.

Plan hygiene matters. Keep the query metric aligned with the index metric, and gather statistics after bulk loads so the optimizer can choose the intended plan. A minimal stats call looks like this:

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(USER, 'DOCUMENTS');
END;
/

BEGIN
  DBMS_STATS.GATHER_INDEX_STATS(USER, 'DOCS_HNSW_IDX');
END;
/

Gather index stats after index builds or rebuilds so the optimizer can cost similarity ordering accurately.

If you evolve index settings or dimensions, make DDL idempotent and, where your release supports it, prefer online rebuilds to avoid unnecessary outages. Article_07 covers schema and index evolution strategies.

Governance continuity does not require special workarounds. When policies are configured on the relevant objects, VPD continues to filter rows, Unified Auditing can capture activity, and Data Redaction can mask columns that should not appear in clear text. As with any sensitive path, set MODULE and ACTION at session start, run your retrieval, and make evidence easy to find in audit views. Consider tagging vector queries with a distinct ACTION (for example, RAG_RETRIEVAL) to simplify Unified Auditing filters.

Docs:

DBMS_VECTOR advisors and accuracy reporting
Security guide landing (VPD, Unified Auditing, Data Redaction, TDE)

Version scope and environment notes

The VECTOR data type, VECTOR_DISTANCE, ANN vector indexes (HNSW/IVF via CREATE VECTOR INDEX), DBMS_VECTOR, DBMS_VECTOR_CHAIN, and the Select AI features shown here require Oracle AI Database 26ai or Autonomous Database. If you are on 19c, you can persist arrays as JSON or BLOB and compute distances in the application, but you do not have native vector columns, vector indexes, or VECTOR_DISTANCE. Always verify your exact RU or service tier for option names, index clauses, and package signatures before you automate builds. If you use external embedding providers, configure credentials and network ACLs before calling them from the database. Ensure the query vector dimension and element type match the column definition; otherwise the statement will error.

Out of scope

This article stays focused on storing vectors, building the two ANN index types, writing hybrid retrieval SQL, and keeping the answer path reviewable. It does not cover Oracle Text, summarization prompts, or application frameworks. For NL2SQL hardening and AI Profile design, see Article 5. For safe schema and index evolution, see Article 7.

Try it next

Load 50–100 real chunks from your docs, store embeddings in a VECTOR column, and build an HNSW index; compare recall and latency before and after switching to IVF on the same corpus.
Turn on SELECT AI SHOWSQL for the answer step so reviewers can sign off on generated SQL before execution.

References

Oracle AI Vector Search overview
VECTOR type, TO_VECTOR, and VECTOR_DISTANCE
CREATE VECTOR INDEX (HNSW/IVF)
Querying with similarity and hybrid predicates
DBMS_VECTOR and DBMS_VECTOR_CHAIN
Select AI examples (SHOWSQL, EXPLAINSQL, object scoping, data access)
Security guide landing (VPD, Unified Auditing, Data Redaction, TDE)

Vector‑native RAG on Oracle: embeddings, HNSW/IVF, and hybrid search under database governance

Key Takeaways

Prerequisites

What “vector‑native” means in Oracle

From document to chunk to embedding

DBMS_HYBRID_VECTOR: keyword + semantic in one call

Choosing an index: HNSW or IVF

Writing hybrid retrieval you can defend

A tiny end‑to‑end demo

Orchestrating the answer: in your app or with Select AI

Operating in production shape

Version scope and environment notes

Out of scope

Try it next

References

Mark Nelson

Architect

Safer NL2SQL with Select AI and AI Profiles

How to Enforce Role-Based Data Access in AI Applications with Oracle Deep Data Security

Vector‑native RAG on Oracle: embeddings, HNSW/IVF, and hybrid search under database governance

Key Takeaways

Prerequisites

What “vector‑native” means in Oracle

From document to chunk to embedding

DBMS_HYBRID_VECTOR: keyword + semantic in one call

Choosing an index: HNSW or IVF

Writing hybrid retrieval you can defend

A tiny end‑to‑end demo

Orchestrating the answer: in your app or with Select AI

Operating in production shape

Version scope and environment notes

Out of scope

Try it next

References

Authors

Mark Nelson

Architect

Safer NL2SQL with Select AI and AI Profiles

How to Enforce Role-Based Data Access in AI Applications with Oracle Deep Data Security