Developing AI applications with OCI Generative AI and LangChain

February 7, 2024 | 5 minute read
Rave Harpaz
Architect/Applied Scientist of AI - Oracle Cloud Infrastructure
Arthur Cheng
Senior Member of Technical Staff - Oracle Cloud Infrastructure

Unlocking new opportunities with LangChain and OCI Generative AI

LangChain is the fastest-growing open source framework for developing applications powered by large language models (LLMs). Designed with flexible code abstractions that simplify development and packed with a comprehensive library of tools, chat utilities, data interfaces, and agent support, LangChain is redefining AI application development.

Oracle Cloud Infrastructure (OCI) Generative AI is a fully managed service that provides a set of state-of-the-art, customizable LLMs that cover a wide range of use cases. Using the OCI Generative AI service, you can access ready-to-use, pretrained models or create and host your own fine-tuned custom models on dedicated AI clusters.

Through a collaboration with LangChain, organizations can now harness the OCI Generative AI service directly from LangChain applications, a partnership that unlocks new opportunities by combining the capabilities of both platforms.

This blog post provides a guide to the essential ingredients needed to set up and build applications powered by OCI Generative AI and LangChain. It assumes basic familiarity with LangChain and that you have set up the permissions and Identity and Access Management (IAM) policies required to consume OCI services.

Prerequisites

You need recent versions of LangChain and the OCI software development kit (SDK). To install or upgrade these two Python packages, use the following command:

pip install -U langchain oci
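
To confirm that both packages are installed, you can print their versions (exact version numbers will vary):

python -c "import langchain, oci; print(langchain.__version__, oci.__version__)"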

Consuming models offered by OCI Generative AI through LangChain only requires you to initialize an interface with your OCI endpoint, model ID, compartment OCID, and authentication method. From that point onward, you can use the models offered by OCI Generative AI in the same manner and with the same level of abstraction as any other models supported by LangChain. The following sections describe the main components and attributes of this interface.

OCI Generative AI endpoints

All models hosted in OCI Generative AI are currently accessible through a single API endpoint, https://inference.generativeai.us-chicago-1.oci.oraclecloud.com. You can consume the models in the following ways:

  • On-demand: Access pretrained LLMs and embedding models.
  • Dedicated AI clusters: Deploy pretrained LLMs or fine-tuned custom models exclusive to your tenancy.

Authentication

The authentication methods supported for LangChain are equivalent to those used with other OCI services and follow the standard SDK authentication methods: API key, session token, instance principal, and resource principal.
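
As a minimal sketch, you select the method through the auth_type attribute of the LangChain interface (OCIGenAI, introduced in the usage examples below). Here, the compartment OCID is a placeholder, and instance principal authentication requires running on an appropriately configured OCI compute instance:

from langchain_community.llms import OCIGenAI

llm = OCIGenAI(
    model_id="cohere.command",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="MY_OCID",  # replace with your compartment OCID
    # auth_type accepts "API_KEY" (default), "SECURITY_TOKEN",
    # "INSTANCE_PRINCIPAL", or "RESOURCE_PRINCIPAL"
    auth_type="INSTANCE_PRINCIPAL",
)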

Model IDs

Each model hosted in OCI Generative AI is associated with a unique ID. LangChain currently supports the following on-demand pretrained models:

  • cohere.command
  • cohere.command-light
  • meta.llama-2-70b-chat
  • cohere.embed-english-v3.0
  • cohere.embed-english-light-v3.0
  • cohere.embed-multilingual-v3.0
  • cohere.embed-multilingual-light-v3.0

To access models hosted in your dedicated AI cluster, create an endpoint whose assigned OCID (currently prefixed by ‘ocid1.generativeaiendpoint.oc1.us-chicago-1’) is used as your model ID in LangChain, as shown in the RAG example below.

Usage examples

Interface instantiation and basic LLM calls

The first example demonstrates how to invoke an OCI Generative AI LLM. To work with OCI Generative AI, import the OCI LLM interface, OCIGenAI, from the langchain_community package. In this example, we initialize the interface with the default authentication method (API key), the cohere.command model, and the OCI Generative AI endpoint.


from langchain_community.llms import OCIGenAI

llm = OCIGenAI(
    model_id="cohere.command",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="MY_OCID", # replace with your OCID
)

response = llm.invoke("Tell me one fact about earth", temperature=0.7)
print(response)

The following example progresses to LangChain’s extended library of features, demonstrates how to pass model parameters, and shows how to use a different OCI authentication method (session token). Initialize model parameters, such as temperature, top_p, and max_tokens, with the OCIGenAI interface by passing a model_kwargs dictionary. In a limited number of scenarios, you can also pass model parameters through LangChain’s invoke methods, as shown in the previous example.


from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate

llm = OCIGenAI(
    model_id="meta.llama-2-70b-chat", 
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="MY_OCID",
    auth_type="SECURITY_TOKEN", 
    auth_profile="MY_PROFILE" # replace with your profile name
    model_kwargs={"temperature": 0.7, "top_p": 0.75, "max_tokens": 200}
)

prompt = PromptTemplate(input_variables=["query"], template="{query}")
llm_chain = LLMChain(llm=llm, prompt=prompt)
response = llm_chain.invoke("what is the capital of france?")
print(response["text"])


The RAG pattern

The final example demonstrates the use of OCI Generative AI embedding models together with a Cohere fine-tuned custom LLM to implement a retrieval-augmented generation (RAG) pattern. Access embedding models through a separate interface, OCIGenAIEmbeddings, which has attributes similar to the LLM interface used in the previous examples. To access fine-tuned models, you must also pass the provider (currently ‘cohere’ or ‘meta’) as an argument to the LLM interface. This example uses the FAISS vector store, which requires the faiss-cpu Python package.


from langchain_community.embeddings import OCIGenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA

embeddings = OCIGenAIEmbeddings(
    model_id="cohere.embed-english-light-v3.0",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="MY_OCID",
)

vectorstore = FAISS.from_texts(
    [
        "Larry Ellison co-founded Oracle Corporation in 1977 with Bob Miner and Ed Oates.",
        "Oracle Corporation is an American multinational computer technology company headquartered in Austin, Texas, United States.",
    ],
    embedding=embeddings,
)

retriever = vectorstore.as_retriever()

rag_prompt_template = """Answer the question based only on the following context:
{context}
Question: {question}
"""

rag_prompt = PromptTemplate.from_template(rag_prompt_template)

llm = OCIGenAI(
    model_id="ocid1.generativeaiendpoint.oc1.us-chicago-1.xxxxxxx", # replace with your custom model OCID
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="MY_OCID",
    provider="cohere"
)

rag = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    chain_type_kwargs={"prompt": rag_prompt},
)

print(rag.invoke("when was oracle founded?"))
print(rag.invoke("where is oracle headquartered?"))

Conclusion

The combination of LangChain and OCI Generative AI provides access to leading AI technologies that enable the development of innovative AI-powered business applications. This blog post provided a guide, with examples, to using OCI Generative AI in combination with LangChain.

We invite you to try OCI Generative AI and our other Oracle AI products. If you’re new to Oracle Cloud Infrastructure, sign up for the Oracle Cloud Free Trial, a free 30-day trial with US$300 in credits.
