Why this matters
One OpenAI-compatible gateway can route to every model family hosted on Oracle Generative AI Infrastructure. OCI Signature v1 signing is handled inside LiteLLM, including Instance Principal and OKE Workload Identity paths. Teams get production controls such as virtual keys, budgets, routing, fallbacks, caching, guardrails, audit logging, and cost tracking.
LiteLLM now treats Oracle Generative AI Infrastructure as a first-class provider. Developers can route requests to Meta Llama, xAI Grok, Cohere Command, Cohere Embed, Google Gemini, and OpenAI gpt-5 through a single OpenAI-compatible endpoint, while LiteLLM handles OCI Signature v1 request signing for every supported authentication path.
That matters because modern AI systems rarely use only one model. A production agent may call a fast model for routing, a long-context model for retrieval, a reasoning model for planning, a vision model for document understanding, and an embedding model for memory. Without a gateway, each model family can bring a different SDK, authentication scheme, request shape, and rate-limit policy.
LiteLLM removes that complexity. Applications call the familiar OpenAI Chat Completions or embeddings interface. The gateway resolves credentials, chooses the right vendor adapter, transforms the request into the shape Oracle Generative AI Infrastructure expects, signs it, and normalizes the response before it returns to the application. Cohere-specific fields, generic model formats, reasoning controls, and streaming response buffering stay behind the gateway boundary.

Figure 1. LiteLLM sits between the application and the OCI tenancy, exposing one OpenAI-compatible API while forwarding signed requests to Oracle Generative AI Infrastructure.
What changed
The new provider guide and implementation bring the integration to parity with other major cloud inference platforms. Previous support focused on an early community contribution for Cohere chat with manual request signing. The current work covers proxy configuration, tool calling, vision input, reasoning parameters, environment-based authentication, and the current OCI model catalog.
All OCI-hosted models are addressable as oci/<model-name>. Application code does not need to branch for Cohere versus generic model families, and existing tools that already speak the OpenAI API can target a LiteLLM proxy with minimal or no code changes.
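To make the naming convention concrete, here is a minimal sketch that splits an oci/<vendor>.<model> string into its parts and labels it as Cohere or generic. The helper name split_oci_model, the OciModel type, and the rule that the vendor is the text before the first dot are assumptions made for illustration; this is not LiteLLM's internal code.

```python
from typing import NamedTuple


class OciModel(NamedTuple):
    vendor: str  # text before the first dot, e.g. "cohere" or "xai"
    name: str    # full model name after the "oci/" prefix
    family: str  # "cohere" or "generic" (illustrative labels only)


def split_oci_model(model: str) -> OciModel:
    """Split an 'oci/<vendor>.<model>' string (hypothetical helper)."""
    prefix, sep, name = model.partition("/")
    if prefix != "oci" or not sep or not name:
        raise ValueError(f"expected 'oci/<model-name>', got {model!r}")
    vendor = name.split(".", 1)[0]
    family = "cohere" if vendor == "cohere" else "generic"
    return OciModel(vendor=vendor, name=name, family=family)


# The placeholder model ID from this article resolves to the generic family.
print(split_oci_model("oci/xai.<grok-chat-model>"))
```

The point of the gateway is that this kind of branching happens behind the API, so application code only ever passes the model string.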
Example: call OCI through the LiteLLM SDK
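from litellm import completion

# Assumes OCI credentials come from the OCI_* environment variables.
# The model ID is a placeholder; use a Grok chat model from the current OCI catalog.
response = completion(
    model="oci/xai.<grok-chat-model>",
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    }],
    tool_choice="auto",
)
# Tool calls requested by the model, if any.
print(response.choices[0].message.tool_calls)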
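The example above stops at printing the requested tool calls. The sketch below shows one way the application side might execute them and build the follow-up messages in the OpenAI tool-message shape. The TOOLS registry, run_tool_calls, and the stand-in weather function are hypothetical names, and the tool call is simulated so the snippet runs without a gateway.

```python
import json
from types import SimpleNamespace


def get_current_weather(location: str = "unknown") -> str:
    """Stand-in tool; a real one would query a weather service."""
    return f"Weather lookup for {location} is not implemented in this sketch."


# Hypothetical registry mapping tool names to local Python callables.
TOOLS = {"get_current_weather": get_current_weather}


def run_tool_calls(tool_calls):
    """Execute each requested tool and return OpenAI-style 'tool' messages."""
    messages = []
    for call in tool_calls or []:
        func = TOOLS[call.function.name]
        # Arguments arrive as a JSON string in the OpenAI response format.
        kwargs = json.loads(call.function.arguments or "{}")
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": func(**kwargs),
        })
    return messages


# Simulated entry shaped like response.choices[0].message.tool_calls[0].
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(
        name="get_current_weather",
        arguments='{"location": "Boston"}',
    ),
)
tool_messages = run_tool_calls([fake_call])
print(tool_messages[0]["content"])
```

In a real loop, the assistant message and these tool messages would be appended to the conversation and sent back through completion so the model can produce its final answer.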

Figure 2. A single request flows from an OpenAI-shaped call, through credential resolution, request transformation, OCI Signature v1 signing, and normalized response handling.
Enterprise gateway capabilities
For production teams, the gateway is useful because it centralizes the controls that customers would otherwise need to build for each application:
- Virtual API keys with per-key budgets, RPM and TPM limits, model allowlists, expiry dates, and team or user attribution.
- Cost tracking with request-level attribution to key, team, user, model, or tag.
- Routing and fallback across OCI regions or across providers on rate-limit or 5xx errors.
- Caching through in-memory, Redis, S3, and Qdrant back ends in semantic or exact-match modes.
- Guardrails and audit logging that apply uniformly across all providers, including Oracle Generative AI Infrastructure.
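As an illustration of the routing and fallback bullet, the following sketch tries a list of targets in order and fails over only on rate-limit (429) or 5xx responses. It shows the idea rather than LiteLLM's router: UpstreamError, call_with_fallback, and the target names are all hypothetical.

```python
class UpstreamError(Exception):
    """Hypothetical error carrying an HTTP-style status code."""

    def __init__(self, status: int, message: str = ""):
        super().__init__(message or f"upstream returned {status}")
        self.status = status


def is_retryable(status: int) -> bool:
    # Rate limiting (429) or any server-side 5xx error triggers failover.
    return status == 429 or 500 <= status <= 599


def call_with_fallback(targets, send):
    """Try each target in order; move on only for retryable failures."""
    last_error = None
    for target in targets:
        try:
            return target, send(target)
        except UpstreamError as err:
            if not is_retryable(err.status):
                raise  # e.g. 400 or 401: another target will not help
            last_error = err
    if last_error is None:
        raise ValueError("no targets configured")
    raise last_error


# Demo with a fake sender: the first target is rate limited, the second succeeds.
def fake_send(target):
    if target == "oci-region-a":
        raise UpstreamError(429)
    return f"ok from {target}"


used, reply = call_with_fallback(["oci-region-a", "oci-region-b"], fake_send)
print(used, reply)
```

With a gateway, this policy is declared once in configuration instead of being reimplemented in every application.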
Deployment note: LiteLLM can be deployed entirely within a customer-managed OCI environment, helping organizations keep prompts, credentials, and application data within their tenancy boundaries.
| Capability | Coverage |
| --- | --- |
| Chat, synchronous and streaming | All vendor families |
| Function and tool calling | Cohere plus generic model families |
| Vision and multimodal input | Meta Llama vision variants, Cohere Command vision variants, Google Gemini 2.5 |
| Reasoning controls | Google Gemini 2.5, OpenAI gpt-5, and latest xAI Grok reasoning variants |
| Embeddings | Cohere Embed, single and batch requests up to 96 documents |
| Authentication | Manual credentials, OCI_* environment variables, OCI SDK Signer, Instance Principal, and OKE Workload Identity |
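The 96-document batch limit in the embeddings row means larger corpora need to be split client-side. A minimal sketch of that chunking follows; batches, embed_all, and the stand-in embedder are hypothetical helpers, and the embedder is injected so the snippet runs without network access.

```python
from typing import Callable, Iterator, List, Sequence

MAX_BATCH = 96  # per-request document limit stated in the table above


def batches(docs: Sequence[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` documents."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(docs), size):
        yield list(docs[start:start + size])


def embed_all(
    docs: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
) -> List[List[float]]:
    """Embed every document, one request-sized batch at a time."""
    vectors: List[List[float]] = []
    for chunk in batches(docs):
        vectors.extend(embed_batch(chunk))
    return vectors


# Stand-in embedder: returns a one-dimensional "vector" per document.
def fake_embed(chunk: List[str]) -> List[List[float]]:
    return [[float(len(text))] for text in chunk]


docs = [f"doc {i}" for i in range(200)]
print([len(chunk) for chunk in batches(docs)])  # [96, 96, 8]
print(len(embed_all(docs, fake_embed)))         # 200
```

In practice, embed_batch could wrap a call such as litellm.embedding with an oci/ Cohere Embed model and extract the vectors from the response; the exact model identifier should be taken from the current OCI catalog.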
Example: run a LiteLLM proxy in front of OCI
# config.yaml
model_list:
  - model_name: oci-grok
    litellm_params:
      model: oci/xai.<grok-chat-model>
      oci_region: os.environ/OCI_REGION
      oci_user: os.environ/OCI_USER
      oci_fingerprint: os.environ/OCI_FINGERPRINT
      oci_tenancy: os.environ/OCI_TENANCY
      oci_key_file: os.environ/OCI_KEY_FILE
      oci_compartment_id: os.environ/OCI_COMPARTMENT_ID

Then start the proxy:

litellm --config config.yaml
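Because the configuration reads every credential from the environment, a small preflight check can catch a missing variable before the proxy starts. The sketch below uses the variable names referenced in config.yaml above; the helper names missing_oci_vars and key_file_exists are hypothetical.

```python
import os
from typing import List, Mapping

# Variable names taken from the os.environ/... references in config.yaml.
REQUIRED_OCI_VARS = (
    "OCI_REGION",
    "OCI_USER",
    "OCI_FINGERPRINT",
    "OCI_TENANCY",
    "OCI_KEY_FILE",
    "OCI_COMPARTMENT_ID",
)


def missing_oci_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_OCI_VARS if not env.get(name)]


def key_file_exists(env: Mapping[str, str] = os.environ) -> bool:
    """Check that OCI_KEY_FILE points at a readable file path."""
    path = env.get("OCI_KEY_FILE", "")
    return bool(path) and os.path.isfile(os.path.expanduser(path))


# Demo with a partial environment (placeholder values, not real identifiers).
sample_env = {"OCI_REGION": "<region>", "OCI_USER": "<user-ocid>"}
print(missing_oci_vars(sample_env))
```

Running the check against os.environ in a container entrypoint or CI step gives a clear error message instead of a signing failure on the first request.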
The Agentic Layer
LiteLLM gives applications one contract for every model on Oracle Generative AI Infrastructure. The natural next layer is the OpenAI Agents SDK, OpenAI’s open-source framework for building agentic applications. Agents SDK agents can plan, call tools, hand off work to other agents, enforce guardrails, and stream events back to a UI.
With LiteLLM in front, the Agents SDK can use its built-in OpenAI-compatible model class. The gateway holds the OCI signing credentials and enforces platform controls, while the agent carries only a virtual key issued by the gateway. That keeps model governance, cost attribution, and identity management in one place.
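A virtual key of this kind would typically be issued by a platform administrator through the gateway's key management API. The sketch below only builds the HTTP request and does not send it. The /key/generate path and the field names (models, max_budget, duration, rpm_limit, tpm_limit, team_id) reflect my understanding of the LiteLLM proxy API and should be checked against the current LiteLLM documentation; build_key_request, the budget values, and the team name are hypothetical.

```python
import json
import urllib.request


def build_key_request(gateway_url, admin_key, *, models, max_budget,
                      duration="30d", rpm_limit=None, tpm_limit=None,
                      team_id=None):
    """Build (but do not send) a request for a scoped virtual key."""
    payload = {
        "models": list(models),    # model allowlist for the key
        "max_budget": max_budget,  # spend cap for the key
        "duration": duration,      # expiry, e.g. "30d"
    }
    optional = {"rpm_limit": rpm_limit, "tpm_limit": tpm_limit, "team_id": team_id}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        url=gateway_url.rstrip("/") + "/key/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_key_request(
    "http://litellm-gateway:4000",
    "<admin-key>",
    models=["oci-cohere-command"],
    max_budget=25.0,
    rpm_limit=60,
    team_id="research",
)
print(request.get_method(), request.full_url)
# To actually issue the key: urllib.request.urlopen(request)
```

The agent then receives only the returned key, while the OCI signing credentials never leave the gateway.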
Example: OpenAI Agents SDK on top of the LiteLLM AI Gateway
from agents import Agent, OpenAIChatCompletionsModel, Runner, set_tracing_disabled
from openai import AsyncOpenAI

set_tracing_disabled(True)  # tracing would need an OpenAI platform key

client = AsyncOpenAI(
    api_key="<virtual-key>",  # key issued by the gateway
    base_url="http://litellm-gateway:4000",
)

agent = Agent(
    name="Research assistant",
    instructions="You are a concise research assistant.",
    # "oci-cohere-command" must match a model_name alias in the gateway's model_list.
    model=OpenAIChatCompletionsModel(model="oci-cohere-command", openai_client=client),
)

result = Runner.run_sync(agent, "Summarise the latest news on ...")
print(result.final_output)

Figure 3. The OpenAI Agents SDK consumes the LiteLLM AI Gateway through its OpenAI-compatible model class, while the gateway carries budget, routing, observability, guardrail, and signing responsibilities.
What customers can build
- Multi-model agents that keep planning, tool execution, memory, and vision inside the same OCI tenancy and compartment.
- OpenAI-compatible applications that can be repointed to Oracle Generative AI Infrastructure without an SDK swap.
- Document and image pipelines that use the same image_url content block already supported by OpenAI-compatible vision APIs.
- Hybrid routing setups where LiteLLM fails over from Oracle Generative AI Infrastructure to another provider, or vice versa, without application code changes.
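For the document and image pipelines mentioned above, the image_url content block can carry either a URL or an inline base64 data URL. A minimal sketch of building such a message in the OpenAI-compatible shape follows; image_message is a hypothetical helper, and the bytes in the demo are a stand-in rather than a real image.

```python
import base64


def image_message(prompt: str, image_bytes: bytes,
                  mime_type: str = "image/png") -> dict:
    """Build a user message with a text part and an inline image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }


# Demo with placeholder bytes; a real pipeline would read a scanned page or photo.
message = image_message("Describe this scan.", b"fake-image-bytes")
print(message["content"][1]["image_url"]["url"][:22])  # data:image/png;base64,
```

The resulting dictionary can be passed in the messages list of a chat completion request to any vision-capable model behind the gateway.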
Conclusion
This release turns LiteLLM into a practical enterprise gateway for Oracle Generative AI Infrastructure. Together with the OpenAI Agents SDK, the combination helps Oracle customers move from a few API calls to governed, observed, multi-tenant agent systems with the routing, spend, caching, guardrail, and audit surface required for production AI.
Next steps
- Configure the gateway: LiteLLM OCI provider docs; install litellm, then set the OCI_* variables or attach an OCI signer.
- Validate current OCI details: Oracle Generative AI documentation; confirm model identifiers, serving options, and retirement dates before relying on the examples above.
- Build agentic deployments: OpenAI Agents SDK documentation and the Agents SDK on GitHub (https://github.com/openai/openai-agents-python); use the Agents SDK when you need governed, multi-agent workflows on top of the gateway.
