Over many years, Oracle Database has proven its versatility by not only supporting many of the latest data types and workloads including JSON, graph, spatial, in-memory, blockchain, and vector, but also by leveraging the best, state-of-the-art hardware. Today, we leverage cutting-edge CPU technologies in our Engineered Systems and Cloud. But what about GPUs? How do they fit with Oracle Database?

At Oracle CloudWorld 2024 in Las Vegas, we are demonstrating two GPU-accelerated capabilities for Oracle Database that utilize NVIDIA GPUs to accelerate AI Vector Search functionality in Oracle Database 23ai.

The first capability is the GPU-accelerated creation of vector embeddings from a variety of input data types, such as text, images, and video. Vector embedding creation is a necessary first step in AI Vector Search, and this capability enables efficient bulk vectorization of large volumes of input data.

The second is an early-stage proof of concept that illustrates how GPUs can be used to accelerate vector index creation and maintenance within Oracle Database. Vector index creation is the second step in AI Vector Search, and the fast creation and maintenance of vector indexes is essential for supporting high-volume AI vector workloads.

Both capabilities illustrate that, while the entire AI Vector Search pipeline can run on CPUs, GPUs can work in synergy with CPUs to accelerate the computationally intensive portions of vector search workloads, showcasing the potential for performance improvements in these areas.

“We are always looking for emerging software and hardware technologies that can benefit Oracle Database users,” said Juan Loaiza, Executive Vice President of Mission-Critical Database Technologies, Oracle. “The evolution of vector search and the use of GPUs to accelerate compute-intensive portions of the workflow show how these technologies can work in concert to help meet the needs of our customers. Working with NVIDIA GPUs has allowed us to demonstrate performance improvements for new AI Vector Search capabilities at Oracle CloudWorld.”

“NVIDIA AI tools give Oracle database users the flexibility to use GPUs wherever they add the most value. Embedding generation and vector search index creation are two important tasks that are required by vector databases, such as the Oracle database,” said Michael Kagan, NVIDIA CTO. “With GPU-accelerated AI vector search, Oracle database users can significantly improve the performance of their AI pipelines.”

Accelerating Oracle Database performance for Vector Search capabilities

GPUs provide exceptional memory bandwidth and computational power for specialized compute-intensive use cases. They do not replace the general-purpose CPUs used for traditional database workloads like OLTP or traditional analytics. However, they can help speed up operations involving “dense computations,” including common AI and machine learning (ML) operations that involve multi-pass computations on the same memory-resident data.

AI Vector Search in Oracle Database 23ai enables intelligent search for unstructured as well as structured business data by using AI techniques. It includes the ability to run imported ONNX models using CPUs inside Oracle Database, a new VECTOR datatype to store vector embeddings, new VECTOR indexes for approximate nearest neighbor (ANN) search, and new SQL operators for native search capabilities within Oracle Database. We have highly optimized vector search workloads by implementing SIMD-optimized vector distance kernels and state-of-the-art approximate search indexes which leverage the full memory bandwidth and multi-core parallelism of modern-day CPU platforms such as Exadata Storage Servers. Even after maximizing the benefits of CPU-based processing, certain aspects of vector workloads, including generating vector embeddings and creating vector indexes, can benefit from GPU acceleration.
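To make the distance-kernel idea concrete, here is a minimal NumPy sketch (an illustration, not Oracle's actual SIMD implementation) of the kind of batch distance computation these kernels optimize. Vectorizing the computation over all rows at once lets the underlying BLAS/SIMD machinery exploit the memory bandwidth and parallelism described above:

```python
import numpy as np

def l2_distances(query, vectors):
    """Batch squared-L2 distances from one query to every row.
    Written as one vectorized expression so NumPy can use SIMD under the hood."""
    diff = vectors - query                 # broadcast the query over all rows
    return np.einsum('ij,ij->i', diff, diff)  # row-wise dot(diff, diff)

# Toy corpus of four 3-dimensional vectors
corpus = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [0.9, 0.1, 0.0]])
query = np.array([1.0, 0.0, 0.0])

d = l2_distances(query, corpus)
nearest = int(np.argmin(d))  # index of the exact nearest neighbor
```

An exact scan like this is what approximate indexes avoid repeating over the full data set; the index restricts the search to a small candidate set, and a kernel like the one above scores only those candidates.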

Generating vector embeddings for Oracle Database 23ai using OCI’s NVIDIA GPU-accelerated instances

Vector embeddings are mathematical representations of complex content, such as images, text, or business objects, and are generated by deep learning models called “embedding models.” In Oracle Database, vectors are created by running embedding models against data that resides either inside or outside the database, with the models themselves running either inside or outside the database. The resulting vectors are then stored in VECTOR columns within the database.

In our first demonstration, which will be generally available at Oracle CloudWorld, we are showing how integrated access to NVIDIA GPUs through Oracle Machine Learning (OML) Notebooks in Oracle Autonomous Database – Serverless can be used to generate vector embeddings. This capability lets users leverage the Python interpreter in OML Notebooks—an integral feature of Autonomous Database—to load data from a database table into the GPU VM supporting the notebook’s Python interpreter, generate vector embeddings on the GPU instance, and store those vectors in Oracle Database, where they can be searched using AI Vector Search. Provisioning of the GPU instance is done automatically for users, and data is transferred between the database and the GPU-accelerated VM using functions from Oracle Machine Learning for Python.
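The three-step flow just described—fetch rows, embed in bulk, store the vectors—can be sketched as follows. This is a toy stand-in, not the OML Notebooks API: the `embed` function below produces deterministic pseudo-embeddings in place of a real GPU-hosted model, and a plain dictionary stands in for the table fetch and the VECTOR column write:

```python
import hashlib
import numpy as np

def embed(texts):
    """Stand-in for a GPU embedding model: deterministic pseudo-embeddings.
    In the demo, this step would run a real embedding model on the GPU VM."""
    out = []
    for t in texts:
        h = hashlib.sha256(t.encode()).digest()          # 32 deterministic bytes
        v = np.frombuffer(h, dtype=np.uint8).astype(np.float32)
        out.append(v / np.linalg.norm(v))                # unit-normalize
    return np.stack(out)

# 1) "Load" rows from a table (a plain list stands in for the database fetch)
rows = [(1, "invoice for services"), (2, "quarterly revenue report")]

# 2) Generate embeddings in bulk on the (hypothetical) GPU instance
vectors = embed([text for _, text in rows])

# 3) "Store" vectors keyed by row id (stands in for writing a VECTOR column)
vector_column = {row_id: vec for (row_id, _), vec in zip(rows, vectors)}
```

The bulk shape of step 2 is the point: embedding many rows in one batch is what makes GPU vectorization efficient, compared with embedding one row at a time.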

Generating Oracle Database 23ai VECTOR indexes on GPUs

Vector indexes play a crucial role in approximate search of vector data. Constructing these indexes is compute-intensive and time-consuming. Furthermore, vector indexes have to be maintained and periodically refreshed (or even repopulated) as data gets updated or new data is loaded.

In this proof-of-concept demonstration at Oracle CloudWorld, we are showing the integration of Oracle Database 23ai with NVIDIA GPUs for the creation of an in-memory graph index type known as HNSW (Hierarchical Navigable Small World). The demonstration highlights how a new, Oracle-developed compute-offloading framework enables Oracle Database to transparently delegate complex vector index creation tasks to external servers equipped with powerful GPUs—maintaining simplicity while delivering enhanced performance.

When a user sends a request to create an HNSW vector index to the Oracle Database, the database intelligently compiles and redirects the task, along with the necessary vector data, to a new Compute Offload Server process. This server process leverages the computational power of GPUs and employs the CAGRA algorithm, part of the RAPIDS cuVS library, to rapidly generate a graph index. Once the CAGRA index is created, it is automatically and efficiently transferred back to the Oracle Database instance where it is converted into an HNSW graph index. The final HNSW index is then readily available for subsequent vector search operations within the Oracle Database instance.
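The build-then-search flow above can be sketched in miniature. The toy Python below is an illustration, not the CAGRA or HNSW implementation: an exact k-NN graph built by brute force stands in for the GPU graph construction, and a greedy descent over that graph stands in for the navigable-graph search that HNSW-style indexes perform:

```python
import numpy as np

def build_knn_graph(vectors, k=2):
    """Stand-in for GPU graph construction (CAGRA builds a k-NN graph in
    parallel on the GPU); here, an exact k-NN graph computed by brute force."""
    n = len(vectors)
    graph = []
    for i in range(n):
        d = np.linalg.norm(vectors - vectors[i], axis=1)
        d[i] = np.inf                      # exclude self-edges
        graph.append(list(np.argsort(d)[:k]))
    return graph

def greedy_search(vectors, graph, query, start=0):
    """HNSW-style greedy descent: hop to whichever neighbor is closest to the
    query, stopping when no neighbor improves on the current node."""
    cur = start
    cur_d = np.linalg.norm(vectors[cur] - query)
    while True:
        improved = False
        for nb in graph[cur]:
            d = np.linalg.norm(vectors[nb] - query)
            if d < cur_d:
                cur, cur_d, improved = nb, d, True
        if not improved:
            return cur

# Four 2-D points at the corners of a unit square
vectors = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
graph = build_knn_graph(vectors, k=2)
hit = greedy_search(vectors, graph, query=np.array([0.9, 0.9]), start=0)
```

The expensive part in practice is the graph build, which is why it is the piece offloaded to GPUs; the resulting graph is small and cheap to ship back and convert, and searches then run entirely inside the database.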

This innovative approach combines the versatile converged data management capabilities of Oracle Database with the raw computational power of GPUs. By seamlessly integrating GPU acceleration into the database and offloading the computationally intensive task of index creation to specialized hardware, we can achieve remarkable improvements in processing speed and efficiency, while maintaining the ease of use and reliability that Oracle Database users expect.

Conclusion

These two GPU-accelerated demonstrations showcase the potential for using GPUs to process compute-intensive portions of the AI Vector Search pipeline. Stay tuned for more exciting developments from this collaboration between Oracle and NVIDIA.

For more information on how to get started, visit the AI Vector Search web page, read the AI Vector Search User’s Guide, visit the AI Solutions Hub, and try Oracle Database 23ai for free.