Oracle GoldenGate 23ai and Oracle Database 23ai Vectors

May 2, 2024 | 5 minute read
Alex Lima
Director of Product Management

Oracle Database 23ai introduces the ability to perform AI vector searches directly in the database, and Oracle GoldenGate 23ai provides real-time, bidirectional replication of the new vector data.

Oracle GoldenGate 23ai (now available on-premises and as the fully managed OCI GoldenGate service) enables heterogeneous data integration and high availability across cloud data stores. It delivers new capabilities that allow vectors to be replicated in real time across heterogeneous vector stores. It also allows enterprises to bring AI to all their data quickly and with minimal risk by replicating data from their existing databases to Oracle Database 23ai, where it can be vectorized and indexed for fast AI search.

This blog discusses how Oracle GoldenGate 23ai can help enrich the new Oracle Vector Search features in Oracle Database 23ai.

Oracle AI Vector Search integrates AI vector search capabilities into Oracle Database. It lets users perform semantic searches on unstructured data and use the results to support business decision-making. By bridging the gap between vector data and relational business data, Oracle AI Vector Search eliminates the need for a separate vector database, which removes the complexity of running an additional database engine and enhances data security.
 

Key Features of Oracle Vector Search

  • VECTOR Data Type: Oracle Vector Search introduces the VECTOR data type, which allows users to store vector embeddings directly within Oracle Database tables, alongside their business data. This unified storage simplifies data management and enables sophisticated queries.
  • Vector Embedding Utilities: Oracle Vector Search provides a comprehensive set of utilities, including SQL functions and PL/SQL packages, to generate vector embeddings from unstructured data. Users can choose from a variety of pretrained models or import their own embedding models to suit their specific requirements.
  • Efficient Similarity Searches: Oracle Vector Search features two primary vector indexing methods:
    • Neighbor Partition Vector Index: Partition-based index that clusters vectors based on similarity, ensuring efficient scale-out and fast transactional support.
    • Neighbor Graph Vector Index: In-memory index that represents vectors as vertices and similarities as edges, providing highly accurate and speedy search results.
  • SQL Operator Extensions: Oracle Vector Search extends SQL with operators that combine relational and semantic search within the same query. With these extensions, users can run complex queries involving joins, groupings, and aggregations across vector data and relational data.
  • Retrieval Augmented Generation (RAG): Oracle Vector Search supports RAG, in which pretrained Large Language Models (LLMs) are augmented with business data to produce more accurate responses. End users receive more precise answers to their questions because the responses draw on both private data sources and general knowledge.
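
As a minimal sketch of how these features fit together, the statements below create a table with a VECTOR column, define a vector index, and run a query that combines a relational filter with a similarity search. The table docs, its columns, the index names, the 384-dimension FLOAT32 format, the 95 percent target accuracy, and the bind variable :query_vec are all assumptions chosen for illustration; none of them come from this post.

```sql
-- Hypothetical table: business columns and the embedding live side by side.
CREATE TABLE docs (
  id        NUMBER PRIMARY KEY,
  category  VARCHAR2(50),
  content   VARCHAR2(4000),
  embedding VECTOR(384, FLOAT32)  -- dimension count and format are assumptions
);

-- Option 1: Neighbor Partition vector index.
CREATE VECTOR INDEX docs_ivf_idx ON docs (embedding)
  ORGANIZATION NEIGHBOR PARTITIONS
  DISTANCE COSINE
  WITH TARGET ACCURACY 95;

-- Option 2: in-memory Neighbor Graph vector index.
-- Shown as an alternative to option 1; it needs memory
-- configured for the vector pool before it can be built.
-- CREATE VECTOR INDEX docs_hnsw_idx ON docs (embedding)
--   ORGANIZATION INMEMORY NEIGHBOR GRAPH
--   DISTANCE COSINE
--   WITH TARGET ACCURACY 95;

-- Relational filter plus approximate similarity search in one query.
SELECT id, category
FROM   docs
WHERE  category = 'manuals'
ORDER  BY VECTOR_DISTANCE(embedding, :query_vec, COSINE)
FETCH  APPROX FIRST 5 ROWS ONLY;
```

The bind variable :query_vec stands for a query vector with the same dimension count as the stored embeddings. Using FETCH FIRST instead of FETCH APPROX FIRST would request an exact rather than an approximate search.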

How to Create Vector Embeddings

To create vector embeddings in Oracle Database, you can use a number of different methods, including:

  • Using SQL functions and PL/SQL packages provided by Oracle AI Vector Search
  • Importing pretrained ONNX embedding models
  • Generating vector embeddings outside of Oracle Database and then importing them into the database
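
For the second method, a minimal sketch of importing a pretrained ONNX model with the DBMS_VECTOR package might look like the following. The directory object DM_DUMP, the file name my_embedding_model.onnx, and the model name doc_model are placeholders, and the metadata JSON depends on how the model's input and output tensors are named, so treat every value here as an assumption.

```sql
-- Load an ONNX embedding model from a database directory object.
-- Arguments are positional: directory, file name, model name, metadata.
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    'DM_DUMP',                   -- hypothetical directory object
    'my_embedding_model.onnx',   -- hypothetical model file
    'doc_model',                 -- name the model will have in the database
    JSON('{"function" : "embedding",
           "embeddingOutput" : "embedding",
           "input" : {"input" : ["DATA"]}}')
  );
END;
/

-- Confirm that the model is now registered in the current schema.
SELECT model_name, mining_function
FROM   user_mining_models
WHERE  model_name = 'DOC_MODEL';
```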

Using SQL Functions and PL/SQL Packages

Oracle AI Vector Search provides a number of SQL functions and PL/SQL packages that you can use to create vector embeddings. These functions and packages can be used to create vector embeddings from a variety of data types, including text, images, and audio.

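For example, the following query uses the VECTOR_EMBEDDING SQL function to create a vector embedding from a text string. It assumes that an embedding model named doc_model has already been loaded into the database: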

SELECT 
  TO_VECTOR(VECTOR_EMBEDDING(doc_model USING 'Hello world' AS data)) AS embedding;

This query returns a vector embedding that represents the meaning of the text string 'Hello world'. The embedding can then be compared with the embeddings of other text strings to perform similarity searches.
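
As a sketch of such a similarity search, the query below embeds the search string inline and ranks stored rows by cosine distance. It assumes a hypothetical table docs with a text column content and a VECTOR column embedding that was populated with the same doc_model; the table and column names are illustrative only.

```sql
-- Return the three rows whose embeddings are closest to 'Hello world'.
SELECT id, content
FROM   docs
ORDER  BY VECTOR_DISTANCE(
            embedding,
            VECTOR_EMBEDDING(doc_model USING 'Hello world' AS data),
            COSINE)
FETCH  FIRST 3 ROWS ONLY;
```

The stored vectors and the query vector must come from the same embedding model for the distances to be meaningful.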

Oracle GoldenGate 23ai

Oracle GoldenGate 23ai provides support for Oracle Vectors, including the ability to replicate vectors between databases without downtime, map database functions directly from the Replicat, and support various AI components in databases. This enables use cases such as distributed AI processing, support for multiple vector datatypes, active-active/multi-cloud configurations, data consolidation, and streaming changes to search indexes. 

OGG and Vector Hub

Figure: create a real-time vector hub for GenAI

Here are some examples of how Oracle GoldenGate (OGG) supports vectors and can facilitate their migration to Oracle Database 23ai:

  • Supports Oracle Database 23ai vectors: Oracle GoldenGate supports the new Oracle Database 23ai vectors and allows for full replication of them as long as the dimension and embedding model are the same.
  • Replicate base data instead of vectors: If the source and target databases use different embedding algorithms or have different vector dimensions, Oracle GoldenGate can replicate the actual data used to generate the vectors instead of the vectors themselves.
  • Database function mapping directly from the Replicat: Oracle GoldenGate has added the ability to do database function mapping directly from the Replicat, eliminating the need for SQLEXECs. 
  • Create real-time vector hubs: Oracle GoldenGate facilitates the creation of real-time vector hubs, allowing for efficient access and processing of vectors.
  • Migrate vectors without downtime: Oracle GoldenGate can migrate vectors into Oracle Database 23ai without causing any downtime, ensuring a smooth transition between on-premises environments and the cloud, or between different cloud environments.
  • Enable consolidation, multi-cloud, and active-active scenarios: Oracle GoldenGate supports consolidation, multi-cloud, and active-active scenarios, allowing for flexibility and scalability in vector management.
  • Actionable AI/ML from streaming pipelines: Oracle GoldenGate Stream Analytics integrations with Oracle Machine Learning (OML) and ONNX (Open Neural Network Exchange) enable actionable AI/ML from streaming pipelines, so vectors can be integrated into AI and ML processes.
  • Stream changes to Elasticsearch or OpenSearch compatible search indexes: Oracle GoldenGate can stream changes in vectors to Elasticsearch or OpenSearch compatible search indexes, broadening the scope of vector utilization.
  • Support for PostgreSQL vectors (pgvector): Oracle GoldenGate supports PostgreSQL vectors (pgvector), enabling replication and migration of vectors from PostgreSQL databases. This includes support for pgvector versions 0.5.0 and higher.
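
The first two points in this list imply a simple check before deciding what to replicate. As a sketch on the database side only, assuming a hypothetical docs table on both source and target and a target-side embedding model named target_model (neither is from this post), one could compare vector dimensions and, where they differ, replicate only the base text and regenerate the embeddings on the target:

```sql
-- Run on both source and target. Vectors can be replicated as-is only
-- if the dimension counts match and the same embedding model produced them.
SELECT DISTINCT VECTOR_DIMENSION_COUNT(embedding) AS dims
FROM   docs;

-- On the target, after the base data (content) has been replicated
-- without the vector column: regenerate embeddings with the local model.
UPDATE docs
SET    embedding = VECTOR_EMBEDDING(target_model USING content AS data)
WHERE  embedding IS NULL;

COMMIT;
```

This batch update is only one way to regenerate embeddings. The GoldenGate configuration that excludes the vector column from replication is not shown here.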


Conclusion

Oracle AI Vector Search enables businesses to use generative AI without sacrificing data security or operational efficiency. Oracle GoldenGate's integration with Oracle Database 23ai and Oracle AI Vector Search lets customers create real-time vector hubs that power AI solutions with timely, trusted, and transactional business data.
 

Additional Information:

GoldenGate Documentation

GoldenGate Home Page

Oracle GoldenGate YouTube Channel

Alex Lima

Director of Product Management

Alex is Director of Product Management in the GoldenGate group, focusing on GoldenGate Core Product, GoldenGate for Oracle, and GoldenGate Foundation Suite. Alex has an extensive 20-year background in managing, integrating, and architecting Oracle database solutions for customers worldwide, specializing in process improvement, performance tuning, and high availability.

