Deploying and managing generative AI workloads, especially for mission-critical production applications, isn’t a trivial task. As we talk with customers deploying AI workloads on Oracle Cloud Infrastructure (OCI), one challenge keeps coming up: the skills needed to deploy, manage, and scale them effectively. Running AI at scale means understanding GPU configurations, hardware choices, software stacks, AI models, and the right tools for observability and management. It can be resource-, time-, and cost-intensive, but it’s not something businesses can afford to overlook. Staying competitive in today’s market depends on it.

Today, we’re introducing OCI AI Blueprints, a Kubernetes-based management platform for AI workloads with a set of blueprints that help you deploy, scale, and monitor AI workloads in production in minutes. Each blueprint is an OCI-authored, verified deployment package of application and infrastructure resources for common generative AI workloads. Blueprints include clear default hardware recommendations for NVIDIA accelerated computing infrastructure across the most popular AI software applications. These include blueprints built with inference engines and frameworks such as vLLM, the open-source NVIDIA Dynamo, TensorRT™, TensorRT-LLM, and PyTorch; microservices such as NVIDIA NIM and NeMo Retriever, part of the NVIDIA AI Enterprise software suite (NVAIE); and NVIDIA Blueprints.

Getting an ML application launched on a GPU is only the first step; managing its infrastructure dependencies efficiently as your experiments scale is just as critical. OCI AI Blueprints come with end-to-end observability plumbed in, a GPU scale controller, multi-node GPU support, and single-pane cluster management tools, so you can deploy AI workloads without making software stack and infrastructure dependency decisions separately. This approach cuts the lead time from AI proof of concept to production from weeks to minutes.

Benefits of OCI AI Blueprints

With OCI AI Blueprints you can: 

  • Deploy your AI workloads with confidence: Get OCI-verified best practices for running AI workloads on the NVIDIA accelerated computing platform, with prepackaged compute, storage, and networking configurations and ML applications built for AI-specific scenarios.
  • Simplify infrastructure automation and compute efficiency for mission-critical production workloads: Automatically provision infrastructure components such as GPU node pools, storage, and load balancers, and get latency-driven autoscaling with KEDA, distributed inference of large language models (LLMs) across multiple nodes, and multi-instance GPUs. All of these capabilities are abstracted so that you can configure and deploy workloads in minutes, without wrangling YAML files or kubectl.
  • Ease AI monitoring and observability concerns with simple integrations for the most popular open-source observability tools: Automate observability and management tasks with popular open-source software packages, including NVIDIA DCGM Exporter. Necessary add-ons such as Prometheus, Grafana, and MLflow are also installed automatically by AI Blueprints; a sample GPU utilization query is sketched after this list.
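As a concrete illustration of what that stack gives you out of the box, the sketch below queries a Prometheus endpoint for the per-GPU utilization gauge that NVIDIA DCGM Exporter publishes (`DCGM_FI_DEV_GPU_UTIL`). The Prometheus address here is an assumption for illustration; substitute the endpoint exposed in your cluster.

```python
import requests

# Assumption: substitute the Prometheus address exposed by your
# OCI AI Blueprints cluster (this localhost URL is illustrative).
PROMETHEUS_URL = "http://localhost:9090"

# DCGM_FI_DEV_GPU_UTIL is the per-GPU utilization gauge (0-100)
# published by NVIDIA DCGM Exporter.
resp = requests.get(
    f"{PROMETHEUS_URL}/api/v1/query",
    params={"query": "avg by (gpu, Hostname) (DCGM_FI_DEV_GPU_UTIL)"},
    timeout=10,
)
resp.raise_for_status()

for series in resp.json()["data"]["result"]:
    labels = series["metric"]
    _, value = series["value"]
    print(f"GPU {labels.get('gpu')} on {labels.get('Hostname')}: {value}% utilization")
```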

You get a choice of open-source ML library blueprints as well as licensed NVIDIA AI Enterprise (NVAIE) components, combined with OCI best practices for running your NVAIE workloads on NVIDIA GPUs in OCI.
OCI AI Blueprints interface


How to use OCI AI Blueprints

To get started, go to OCI AI Blueprints on GitHub, install the OCI AI Blueprints toolkit in your own tenancy, and access the Blueprints UI and API. You can then choose, deploy, and monitor an AI blueprint that meets your business needs.
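As a rough sketch of what a deployment request can look like, the snippet below posts a blueprint definition to the Blueprints API with Python. The portal URL and every payload field name here are illustrative assumptions; the actual deployment schema and endpoints are documented in the GitHub readme.

```python
import requests

# Assumptions: the portal URL and all payload field names below are
# hypothetical; see the OCI AI Blueprints GitHub readme for the real
# deployment schema and the API endpoint created in your tenancy.
BLUEPRINTS_API = "https://<your-blueprints-portal>/deployment"

deployment = {
    "recipe_id": "llm_inference",                    # hypothetical blueprint id
    "deployment_name": "llama-inference-demo",
    "recipe_node_shape": "VM.GPU.A10.2",             # an OCI NVIDIA GPU shape
    "model_id": "meta-llama/Llama-3.1-8B-Instruct",  # illustrative Hugging Face model
    "replica_count": 1,
}

resp = requests.post(BLUEPRINTS_API, json=deployment, timeout=30)
resp.raise_for_status()
print("Deployment submitted:", resp.json())
```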

The following blueprints are available at launch. We’re rapidly adding new ones, including NVAIE blueprints based on NVIDIA technology, in future releases:

Available Blueprints:

  • LLM inference of the most popular LLM models from Hugging Face (see the endpoint sketch after this list)
  • LLM fine-tuning with your custom dataset on NVIDIA GPUs
  • LLM benchmarking on NVIDIA GPUs for ML training exercises
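Once the inference blueprint is running, you can typically exercise the served model like any OpenAI-compatible endpoint, since vLLM exposes one by default. The service address below is an assumption; substitute the endpoint or load balancer address created by your deployment.

```python
import requests

# Assumption: substitute the service or load balancer address created
# by your inference deployment; this placeholder is illustrative.
ENDPOINT = "http://<inference-service-address>/v1/chat/completions"

resp = requests.post(
    ENDPOINT,
    json={
        "model": "meta-llama/Llama-3.1-8B-Instruct",  # model served by the blueprint (illustrative)
        "messages": [
            {"role": "user", "content": "Summarize what OCI AI Blueprints does in one sentence."}
        ],
        "max_tokens": 64,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```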

Upcoming Blueprints:

  • NVIDIA GPU Complete Health Check and Tensor Performance Blueprint
  • NVIDIA AI Blueprints PDF to Podcast using NIM with OCI scalable storage and the Kubernetes scale controller


Get started with OCI AI Blueprints today

We encourage you to try OCI AI Blueprints for yourself in your own OCI tenancy. The detailed readme in the GitHub repo will help you navigate the OCI deployment, and we’ve packaged all the dependencies you need to run a generative AI workload efficiently in your tenancy. With AI Blueprints as a generative AI workload platform, you can extend its capabilities to build your own blueprints, achieve consistent and repeatable deployments and management, and adopt industry-standard MLOps practices. Get started today on the AI Solutions Hub.

And while you’re here, come see us at booth #1515 at NVIDIA GTC!