Additional contributors: Miao Yu, Product Management, OCI; Akshai Parthasarathy, Product Marketing, OCI
Today at SC23, we’re announcing our plans to offer Oracle Cloud Infrastructure (OCI) Compute instances powered by the NVIDIA GH200 Grace Hopper Superchip. The GH200 pairs an Arm-based CPU (Grace) with an NVIDIA H100 Tensor Core GPU (Hopper) in a coherent, high-bandwidth memory space of 576 GB. The two are linked by the NVIDIA NVLink chip-to-chip (C2C) interconnect running at 900 GB/s, 7x the bandwidth of the standard PCIe Gen5 lanes found in traditional accelerated systems, providing far more throughput for the most demanding generative AI and HPC applications.
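For the curious, the 7x figure is simple arithmetic once you pin down the PCIe baseline. Here is a quick sanity check, assuming the comparison is against a single PCIe Gen5 x16 link at roughly 64 GB/s per direction (~128 GB/s bidirectional); that baseline is our reading rather than something stated in the announcement:

```python
# Back-of-the-envelope check of the "7x PCIe Gen5" claim.
# Assumption: a PCIe Gen5 x16 link moves ~64 GB/s per direction,
# or ~128 GB/s bidirectional; NVLink-C2C is 900 GB/s total.
nvlink_c2c = 900          # GB/s, from the announcement
pcie_gen5_x16 = 2 * 64    # GB/s bidirectional (assumed baseline)
print(f"{nvlink_c2c / pcie_gen5_x16:.1f}x")  # ~7.0x
```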
The NVIDIA GH200 is planned to be deployed on OCI Compute bare-metal instances, which provide the isolation and control of physical servers with the flexibility and ease of cloud operations. Bare-metal instances provide customers with direct access to the underlying hardware without a hypervisor, which eliminates issues related to “noisy neighbors” and provides maximum isolation for high-performance and latency-sensitive workloads. The OCI family of bare-metal instances includes Standard shapes for a wide range of use cases, Dense I/O shapes for large databases and big data workloads, HPC and Optimized shapes for high-performance computing, and GPU shapes for hardware-accelerated workloads.
The upcoming OCI BM.GPU.GH200 instance is a bare-metal GPU shape designed for the most demanding AI inference workloads that need fast, high-capacity memory for large language models (LLMs) and recommender systems. Other workloads that can benefit from these instances include vector databases, scientific and high-performance computing, graph neural networks (GNNs), and single-instance LLM inference. Compared to the NVIDIA H100 GPU, each NVIDIA GH200 provides up to 9x the AI training performance, 2.3x the LLM inference performance, and 1.7x the HPC performance.
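Before committing a large model to one of these shapes, it can help to confirm what the GPU actually reports. The following is a minimal sketch using PyTorch (our choice of stack, not something prescribed by the announcement); on a BM.GPU.GH200 shape we would expect a single H100 device with roughly 96 GB of HBM3:

```python
import torch

# Query the Hopper GPU visible to CUDA on the instance.
assert torch.cuda.is_available(), "No CUDA device visible"
props = torch.cuda.get_device_properties(0)
print(f"Device:   {torch.cuda.get_device_name(0)}")
print(f"HBM3:     {props.total_memory / 1e9:.0f} GB")  # ~96 GB expected
print(f"SM count: {props.multi_processor_count}")
```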
| Instance name | BM.GPU.GH200 |
| --- | --- |
| CPU | 1x Grace CPU, 72-core Armv9 |
| GPU | 1x NVIDIA H100, 96-GB HBM3 |
| Cache-coherent memory | 480-GB LPDDR5X |
| Memory bandwidth | 512 GB/s |
| CPU-GPU connectivity | NVLink-C2C (900 GB/s) |
| Storage | 2.88-TB NVMe local disk |
| Networking | 1x 100 Gbps |
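Once the shape is available, launching it should look like any other bare-metal launch on OCI. Here is a minimal, hypothetical sketch using the OCI Python SDK; all OCIDs and the availability domain below are placeholders, and availability of the BM.GPU.GH200 shape in your region and tenancy is an assumption:

```python
import oci

# Minimal sketch: launch a bare-metal GH200 instance with the OCI Python SDK.
# All OCIDs and the availability domain are placeholders.
config = oci.config.from_file()  # reads ~/.oci/config by default
compute = oci.core.ComputeClient(config)

details = oci.core.models.LaunchInstanceDetails(
    availability_domain="Uocm:PHX-AD-1",               # placeholder AD
    compartment_id="ocid1.compartment.oc1..example",   # placeholder OCID
    shape="BM.GPU.GH200",
    display_name="gh200-inference",
    create_vnic_details=oci.core.models.CreateVnicDetails(
        subnet_id="ocid1.subnet.oc1..example"          # placeholder OCID
    ),
    source_details=oci.core.models.InstanceSourceViaImageDetails(
        image_id="ocid1.image.oc1..example"            # placeholder OCID (Arm image)
    ),
)

instance = compute.launch_instance(details).data
print(instance.id, instance.lifecycle_state)
```

Because this is a fixed bare-metal shape rather than one of OCI's flexible VM shapes, no shape_config block is needed in the launch details.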
"Uber has achieved great results with OCI Compute and NVIDIA GPUs for Michelangelo, our machine learning platform. With the upcoming instances based on NVIDIA Grace Hopper, we aim to get even more value for similar workloads." –Kamran Zargahi, Senior Director of Technology Strategy, Uber
"Cohere is leveraging OCI Compute GPU instances to bring the power of generative AI to Oracle. We expect to create and serve LLMs faster and more economically with these new instances based on NVIDIA GH200." –Martin Kon, President and COO, Cohere
If you’re interested in OCI Compute instances with the NVIDIA GH200 Grace Hopper Superchip, start a conversation and learn more about AI infrastructure today. Visit us at booth #937 at SC23, the supercomputing conference, and attend our other events at the conference.
Disclaimer:
The preceding announcement is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation.
Sagar leads compute product management for OCI and is focused on delivering flexible, scalable, and high-performance infrastructure. Prior to OCI, he helped lead compute products at NVIDIA and AWS.