Oracle and NVIDIA Announce NVIDIA HGX-2 for Oracle Cloud Infrastructure & Collaboration on RAPIDS Accelerated Data Science Software

Karan Batta
Product Management

From enabling autonomous vehicles to global climate simulations, rapid progress in AI and HPC has transformed entire industries while demanding massive increases in complexity and compute power. As part of this transition, Oracle Cloud Infrastructure has spent the last 12 months collaborating with NVIDIA to deliver cutting-edge bare-metal and virtual machine instances for engineers, data scientists, researchers and developers. This collaboration puts the power to tackle the greatest AI and HPC challenges at their fingertips. Oracle Cloud Infrastructure was the first public cloud provider to launch bare-metal instances based on the NVIDIA Pascal GPU architecture in 2017, and followed that with another public cloud first: general availability of bare-metal instances with NVIDIA’s Tesla V100 Tensor Core GPUs, which help make deep learning workloads even faster.

Today, in collaboration with NVIDIA, we’re excited to announce that Oracle Cloud Infrastructure will offer the NVIDIA HGX-2 platform in both bare-metal and virtual machine instances, giving customers access to a unified HPC and AI computing architecture. HGX-2 is designed for multi-precision computing: high-precision FP64 and FP32 for accurate HPC, and faster, reduced-precision FP16 and INT8 for AI. Combined with 2 petaFLOPS of compute and NVIDIA NVSwitch interconnect technology providing 300 GB/sec of GPU-to-GPU bandwidth, HGX-2 can accelerate the most demanding applications.
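
To make the precision trade-off concrete, here is a minimal sketch that emulates FP16, FP32 and FP64 rounding on the CPU with Python's standard struct module. It is an illustration only, not HGX-2 or GPU code, and the helper names round_to and accumulate are hypothetical. It adds 0.001 ten thousand times and rounds the running total to the chosen format after every addition, the kind of long accumulation that is common in simulation codes.

```python
import struct

# struct format codes for IEEE 754 binary16, binary32 and binary64.
FORMATS = {"FP16": "e", "FP32": "f", "FP64": "d"}


def round_to(fmt: str, value: float) -> float:
    """Round a Python float (FP64) to the given format and convert it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def accumulate(fmt: str, step: float, count: int) -> float:
    """Add `step` to a running total `count` times, rounding after every add."""
    step = round_to(fmt, step)
    total = 0.0
    for _ in range(count):
        total = round_to(fmt, total + step)
    return total


if __name__ == "__main__":
    exact = 10.0  # 10,000 steps of 0.001
    for name, fmt in FORMATS.items():
        total = accumulate(fmt, 0.001, 10_000)
        print(f"{name}: total={total!r} error={abs(total - exact):.3e}")
```

In this sketch the FP64 and FP32 totals stay close to 10, while the FP16 total stops growing once the step becomes smaller than half the spacing between adjacent representable values. That is why long-running numerical accumulations favor high precision, whereas deep learning workloads, which tolerate small rounding errors, can trade precision for speed.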

“This new collaboration with Oracle will help fuel incredible innovation across a wide range of industries and uses,” said Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA. “By taking advantage of NVIDIA’s latest technologies, Oracle is well positioned to meet surges in demand for GPU acceleration for deep learning, high-performance computing, data analytics and machine learning.”

These instances on Oracle Cloud Infrastructure will also include up to 48 cores of Intel Xeon processors running at a 3.5 GHz all-core turbo frequency, up to 768 GB of system memory, up to 25 Gbps of non-oversubscribed bandwidth, and the ability to attach up to 1 petabyte of NVMe block storage. Oracle Cloud Infrastructure will offer the following instances in early 2019, along with a 16-way instance:

Instance    Cores (3.5 GHz all-core turbo)   Memory   Storage                       GPUs
BM.GPU4.8   48                               768GB    1 Petabyte of Block Storage   8x 32GB V100 with NVSwitch
VM.GPU4.4   22                               360GB    1 Petabyte of Block Storage   4x 32GB V100 with NVSwitch
VM.GPU4.2   11                               180GB    1 Petabyte of Block Storage   2x 32GB V100 with NVSwitch
VM.GPU4.1   5                                90GB     1 Petabyte of Block Storage   1x 32GB V100


Apart from enabling HPC and AI workloads, we’re targeting data science and analytics as a major area of investment. This is bolstered by recent acquisitions and work on Oracle's Data Science Cloud, which makes it easy and intuitive for data science teams to work collaboratively on the data-driven projects that transform how companies do business. We are enabling use cases such as using a data science platform to bring algorithmic decision-making to drug development, diagnostics, and clinical trials, or to lending, investing, and banking in the fintech sector.

NVIDIA RAPIDS Software Framework

That is why we’re excited to collaborate with NVIDIA to support the newly announced RAPIDS software, a set of open source libraries from NVIDIA for accelerating end-to-end data science training pipelines on NVIDIA GPUs. RAPIDS dramatically speeds up the data science pipeline by moving workflows onto the GPU, optimizes machine learning training with more iterations for better model accuracy, and accelerates the Python data science toolchain with hassle-free integration and minimal code changes.

Support for NGC Containers Now Generally Available on Oracle Cloud Infrastructure

You can download a variety of GPU-accelerated containers from the NVIDIA GPU Cloud (NGC) container registry and run them on Oracle Cloud Infrastructure! Support was first announced in preview at NVIDIA’s GPU Technology Conference in Silicon Valley. General availability means that everyone can now easily deploy containerized applications and frameworks from NGC for HPC, data science and AI, and run them seamlessly on Oracle Cloud Infrastructure while taking advantage of the portfolio of GPU instances across multiple regions in the U.S. and Europe. Find out how to use NGC containers on Oracle Cloud Infrastructure here. For more information about Oracle Cloud Infrastructure’s GPU offerings, visit https://cloud.oracle.com/iaas/gpu.

Oracle Cloud Infrastructure at GTC Europe 2018

The Big Compute & HPC teams will be at NVIDIA’s GTC Europe in Munich in full force, so I encourage you to come and speak to our engineering teams, see demos, and get hands-on experience with Oracle Cloud Infrastructure. Additionally, attend our general session titled “E8528 - AI & HPC Infrastructure on Oracle Cloud Infrastructure” on Thursday, 11th October at 16:30 in Room 22.


See you there!

