This is part 3 of our blog series “Generative AI and its use cases for enterprise applications”. In the previous article, we discussed the AI Knowledge Assistant use case, the solution architecture, and the steps to implement it. We subscribed to the OCI Generative AI service and created the resources needed to build the solution. In this blog, we continue with the implementation steps and share some resources you can follow to do this on your own.

 

Steps continued:

Step – 10 : Upload documents to object storage buckets (Documentation)
 

Figure 1: Upload documents to Object Storage bucket

Step – 11 : Building a vector search application

Steps : 

  • Create the necessary users and database objects, and grant privileges.
  • Configure the Access Control List and credentials to access Object Storage and OCI Generative AI from Autonomous Database.
  • Create tables to store documents, chunks, vectors, prompts and conversation history. (Detailed steps are in the hands-on lab link in the References section.)

Figure 2: Creating users and credentials to access documents


  • Load documents from object storage

Please note that loading documents into the database is optional. If you prefer not to load data into the database, you can either point the OCI Generative AI Agents service at the data source, or create a vector index in Oracle Database 23ai and point it at the Object Storage bucket. The vector index creates a data pipeline in the background that embeds the data directly and runs periodically at the refresh rate configured in the pipeline.

Figure 3: Load documents from Object Storage bucket

  • Create chunks and embed them using OCI Generative AI service

Figure 4: Chunk the documents before embedding

Figure 5: Creating vectors from chunks
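To make the chunking step more concrete, here is a minimal Python sketch of word-based chunking with overlap. It is only an illustration of the idea: in the solution described here, the database's own APIs do the chunking, and the function name `chunk_words` and the `max_words` and `overlap` values are assumptions made for this example, not settings taken from the lab.

```python
def chunk_words(text: str, max_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words.

    Hypothetical helper for illustration only. The sizes are assumptions,
    not the values used in the hands-on lab.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(12))
    for chunk in chunk_words(sample, max_words=5, overlap=2):
        print(chunk)
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, which tends to help retrieval later on.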

  • Execute a vector search query. Vectorize the query and run a similarity search against the vectors we stored in 23ai.

Figure 6: Executing a vector search query
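The similarity search itself boils down to comparing the query vector with each stored chunk vector and keeping the closest matches. In Oracle Database 23ai this comparison runs in SQL (for example with the `VECTOR_DISTANCE` function); the Python sketch below only illustrates the logic. The three-dimensional vectors are hand-written toy values standing in for real embeddings from OCI Generative AI, and the names `cosine_distance` and `top_k` are assumptions for this example.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query_vec: list[float], rows: list[tuple[str, list[float]]], k: int = 3):
    """Rank (chunk_text, vector) rows by distance to the query and keep the k closest."""
    scored = [(cosine_distance(query_vec, vec), text) for text, vec in rows]
    scored.sort(key=lambda pair: pair[0])
    return scored[:k]


if __name__ == "__main__":
    # Toy vectors (assumption): real embeddings have hundreds of dimensions.
    rows = [
        ("payment terms", [1.0, 0.0, 0.0]),
        ("delivery schedule", [0.0, 1.0, 0.0]),
        ("total amount in USD", [0.9, 0.1, 0.0]),
    ]
    for distance, text in top_k([1.0, 0.0, 0.0], rows, k=2):
        print(f"{distance:.4f}  {text}")
```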

Now we have our vector search building blocks ready. The next step is to build a user interface that simplifies the end-to-end vector search process. Any application development technology would work for this. In our case, we will use Oracle APEX, which is provided out of the box with Autonomous Database, to build a simple chat interface for viewing existing agreements, uploading new ones, and asking questions against them.

Step – 12 : Front-end chat interface using Oracle APEX

The hands-on lab provides a sample APEX app that you can use and enhance based on your requirements. For the purpose of this blog, we will not cover building the APEX app from scratch. You can check the existing agreements embedded in 23ai in the Agreements section under the Menu. Let’s add more agreements through the UI by clicking on the Update Agreements section and selecting the Create button.

Figure 7: Viewing existing agreements

Figure 8: Adding new agreements

Alright! We have now added the documents to the data set. Let’s see how we can run a vector search query.

Step – 13 : Query data from documents in Natural Language

Click on the Search Agreements section in the Menu, then enter your query in the Prompt box at the bottom of the screen and hit the Send button.

Question 1  : Let’s try to summarize the purchase agreement 79917676

Figure 9: Summarize agreement

Now, let’s find out the total amount in USD of all items purchased in agreement 79917676.

Question 2 : What is the total amount in USD of all items purchased in agreement 79917676?

Figure 10: Total amount in agreement

Awesome! We now have a Retrieval Augmented Generation (RAG) solution that lets us ask questions against the documents.
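For readers new to RAG, the "augmented" part simply means that the chunks returned by the vector search are placed into the prompt alongside the user's question before it goes to the language model. The Python sketch below shows one way this could look. The prompt wording and the function name `build_rag_prompt` are assumptions for illustration and are not taken from the sample APEX app.

```python
def build_rag_prompt(question: str, chunks: list[str], max_chunks: int = 3) -> str:
    """Combine retrieved chunks and the user's question into one prompt.

    Hypothetical helper: the instruction text below is an assumption,
    not the prompt used by the sample application.
    """
    # Number each chunk so the model (and the reader) can tell them apart.
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}"
        for i, chunk in enumerate(chunks[:max_chunks], start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


if __name__ == "__main__":
    retrieved = [
        "Agreement 79917676 covers the purchase of office equipment.",
        "All amounts in agreement 79917676 are stated in USD.",
    ]
    print(build_rag_prompt("Summarize the purchase agreement 79917676", retrieved))
```

Capping the number of chunks keeps the prompt within the model's input limit; the right cap depends on the chunk size and the model in use.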

An important aspect of this solution is that the end-to-end flow of the vector search application is implemented with just 3 main components:

  1. Object Storage to store the agreements.
  2. Oracle Database 23ai to load and chunk the documents, store their vectors, and run the similarity search.
  3. OCI Generative AI service to embed the document chunks and vectorize search queries.

No other tools or technologies were needed to build this application. Oracle Database 23ai provides all the APIs out of the box to load, chunk and embed the documents. This simplifies the architecture and reduces the complexity and learning curve, allowing organizations to leverage their existing Oracle investment without any extra cost and quickly generate value for their business.

References:

1. Oracle Database 23ai – Vector Search Documentation

2. Oracle Database 23ai – Hands-on lab by Steve Nichols