OCI Generative AI continues to expand how developers and enterprises work with foundation models. Today, we're announcing support for the open-weights NVIDIA Nemotron 3 Super model, enabled by a new Model Import capability in OCI Generative AI. Nemotron 3 Super, soon to be available on Oracle Government Cloud in addition to commercial cloud regions, is the first model from NVIDIA available through OCI Generative AI Model Import. It demonstrates how organizations can now run advanced reasoning models on OCI while maintaining control over customization and deployment.

Model Import allows customers to bring supported models into OCI Generative AI and run them through the same managed service used for Oracle-hosted models. This combines the flexibility of open models with a consistent API, enterprise security model, and operational experience. 

Oracle operates government cloud regions in the US, UK, and Australia, giving governments worldwide a way to run generative AI models while meeting local data residency, classification, operational, and security requirements.

Announcing Support for NVIDIA Nemotron 3 Super

OCI Generative AI now supports importing open-weights NVIDIA Nemotron 3 Super models.

Nemotron 3 Super is designed for high-performance reasoning, strong instruction following, and enterprise generative AI workloads, and is part of the broader Nemotron 3 reasoning family that also includes Nemotron Nano and Ultra models for different deployment and capability needs. As described by NVIDIA, “NVIDIA Nemotron Super is a hybrid MoE model with the highest compute efficiency and accuracy for multi-agent applications and for specialized agentic AI systems.” Nemotron 3 Super is positioned toward the top of the openness-intelligence spectrum, combining open weights with documented data sources and techniques, and delivering strong benchmark performance and efficiency.

This makes it particularly well-suited for emerging enterprise AI patterns where multiple agents collaborate to complete complex tasks. These systems increasingly rely on models that can coordinate reasoning, planning, and execution efficiently across workflows.

For organizations already building on NVIDIA accelerated computing, Nemotron 3 Super integrates naturally into existing AI pipelines while benefiting from OCI’s unique cloud infrastructure.

Typical use cases include:

  • Enterprise copilots requiring consistent reasoning
  • Multi-agent collaboration and agentic AI workflows
  • Advanced retrieval-augmented generation (RAG) pipelines
  • Domain-specific assistants in legal, financial, engineering, or scientific environments
  • Organizations standardizing on NVIDIA’s AI ecosystem across training and inference

OCI Generative AI provides the managed platform layer, while Nemotron 3 Super delivers the model capabilities required for advanced workloads, especially for enterprises looking for a high degree of openness across weights, data, and training techniques, backed by NVIDIA's committed roadmap for the Nemotron 3 family.

Model Import: Bringing Your Own Models into OCI Generative AI

Model Import introduces a new level of flexibility by allowing teams to run their own models inside OCI Generative AI while maintaining a unified operational environment.

You can import large language models from Hugging Face or from OCI Object Storage buckets into OCI Generative AI, create endpoints for those models, and use them in the Generative AI service. Your applications interact with imported models through the same APIs and workflows as managed models, so organizations can standardize on preferred open-source or research models without introducing additional platforms or operational complexity.

Figure 1: New Import Models feature in OCI Generative AI console

Imported models support common generative AI capabilities such as:

  • Text-to-text generation
  • Image-and-text to text workflows
  • Embeddings generation
  • Reranking models for retrieval pipelines

Because imported and managed models share the same service interface, applications can evolve over time without architectural changes.
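As an illustration of that shared interface, the sketch below builds chat request bodies as plain Python dictionaries mirroring the shape of the OCI Generative AI inference API. Everything here is a hedged sketch, not production code: the compartment, model, and endpoint OCIDs are placeholders, and in practice you would send the payload with the OCI SDK or a signed HTTPS request.

```python
# Hedged sketch: chat request bodies in the general shape of the OCI
# Generative AI inference API. All OCIDs are placeholders, not real
# resources, and field names should be checked against the API reference.

def chat_request(serving_mode: dict, prompt: str) -> dict:
    """Build a chat payload; only the serving mode differs per model type."""
    return {
        "compartmentId": "ocid1.compartment.oc1..example",  # placeholder
        "servingMode": serving_mode,
        "chatRequest": {
            "apiFormat": "GENERIC",
            "messages": [
                {"role": "USER", "content": [{"type": "TEXT", "text": prompt}]}
            ],
            "maxTokens": 512,
        },
    }

# An imported model (e.g., Nemotron 3 Super) deployed on a Dedicated AI
# Cluster is addressed through its endpoint:
imported = chat_request(
    {"servingType": "DEDICATED",
     "endpointId": "ocid1.generativeaiendpoint.oc1..example"},  # placeholder
    "Summarize the key risks in this contract.",
)

# A service-hosted model is addressed by model ID instead:
hosted = chat_request(
    {"servingType": "ON_DEMAND",
     "modelId": "ocid1.generativeaimodel.oc1..example"},  # placeholder
    "Summarize the key risks in this contract.",
)

# Everything other than the serving mode is identical.
assert imported["chatRequest"] == hosted["chatRequest"]
```

Because only the serving mode changes between the two calls, application code written against a managed model can later target an imported one without structural changes.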

The import workflow follows a simple pattern:

  1. Select a supported model from Hugging Face or prepare model artifacts in OCI Object Storage.
  2. Import the model into OCI Generative AI.
  3. Deploy it on a Dedicated AI Cluster.
  4. Create an endpoint and access the model through the OCI Generative AI API, SDK, or Playground.

For technical details and supported configurations, see the Supported Models for Import documentation.
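The four-step workflow above can be sketched as an ordered plan. This is purely illustrative: the operation names below mirror the workflow stages rather than actual OCI SDK or CLI calls, and the Object Storage URI is a placeholder.

```python
# Illustrative sketch of the Model Import workflow as an ordered plan.
# The operation names mirror the four stages described above; they are
# not real OCI SDK calls, and all identifiers and URIs are placeholders.

def build_import_plan(source: dict) -> list:
    """Return the ordered operations from model artifacts to live endpoint."""
    # Step 1: the source is either a supported Hugging Face model or
    # model artifacts prepared in an OCI Object Storage bucket.
    if source["type"] not in ("HUGGING_FACE", "OBJECT_STORAGE"):
        raise ValueError("source must be Hugging Face or Object Storage")
    return [
        {"step": 1, "op": "selectSource", "source": source},
        {"step": 2, "op": "importModel"},
        {"step": 3, "op": "deployOnDedicatedAiCluster"},
        {"step": 4, "op": "createEndpoint",
         "access": ["API", "SDK", "Playground"]},
    ]

# Example: artifacts staged in an Object Storage bucket (placeholder URI).
plan = build_import_plan({
    "type": "OBJECT_STORAGE",
    "uri": "oci://my-bucket@my-namespace/nemotron-3-super/",  # placeholder
})
```

The plan makes the ordering explicit: the Dedicated AI Cluster must exist before the endpoint that exposes the model through the OCI Generative AI API, SDK, or Playground.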

Enabling Agentic and Multi-Agent Architectures

Model Import becomes particularly valuable as organizations adopt agentic AI architectures, where multiple models collaborate to complete complex workflows.

In these systems, different agents often specialize in planning, reasoning, retrieval, or execution. Rather than relying on a single model, teams select models based on the role they perform within the system.

OCI Generative AI allows these models to run within a single managed environment, making it easier to orchestrate agent interactions while maintaining consistent security and operational controls.

This is where Nemotron 3 Super fits naturally. Its hybrid MoE architecture and strong reasoning performance make it well-suited for coordination and reasoning roles within multi-agent systems, helping improve collaboration between agents while maintaining efficient inference performance.

Model Import in Context

As OCI Generative AI continues to expand, it becomes increasingly important to distinguish how different models are delivered and operated within the service. OCI Generative AI supports multiple model types, each designed to address different customer requirements while maintaining a consistent developer experience.

Today, customers can work with three model categories:

  • External Models — models accessed through OCI Generative AI but hosted externally, such as Gemini or Grok. These provide access to leading models through a unified API, allowing customers to leverage innovation across the ecosystem.
  • Service Hosted Models — models hosted and operated directly within OCI Generative AI, including models from Cohere, Meta Llama, and OpenAI OSS. These models provide a fully managed experience on OCI, allowing teams to focus on application development without managing infrastructure.
  • Model Import (Bring Your Own Model) — models imported and operated within OCI Generative AI by customers themselves, such as NVIDIA Nemotron 3 Super. This option provides greater control over model selection, customization, and lifecycle while keeping the same platform experience.

Model Import complements the existing model options by extending flexibility rather than replacing them. Teams can start with Service Hosted or External Models to move quickly, and later introduce imported models when specific customization, governance, or standardization requirements emerge.

By supporting multiple model types within a single service, OCI Generative AI allows organizations to adopt new models as innovation evolves while maintaining a consistent operational and security framework.

From Day One to Deep Customization — on One Platform

Whether you are starting with generative AI or building highly customized production systems, OCI Generative AI evolves with your needs.

Start with fully managed models, adopt new models, or import and fine-tune your own — all within a single enterprise-grade service.

In addition, the Nemotron 3 Super NIM will soon be available in the OCI Console and through Data Science AI Quick Actions.

Learn More