NVIDIA GPU Cloud Containers on Oracle Cloud Infrastructure
This week at the NVIDIA GPU Technology Conference in Munich, the Oracle Cloud Infrastructure team is happy to announce general availability of support for NVIDIA GPU Cloud (NGC) containers. You can read about this and the other exciting Oracle and NVIDIA news in the press release.
With this new capability, you can now easily run the GPU-accelerated containers from NGC on the best price-performance cloud.
“AI is a strategic imperative for every industry. With the availability of Tesla V100 in Oracle Cloud Infrastructure, researchers and developers can tap into the world’s fastest accelerators to fuel faster discoveries and insights,” said Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA. “The integration of NVIDIA GPU Cloud’s software containers optimized to fully leverage the Tesla V100 will ensure that enterprises around the world can access the technology they need to accelerate their AI research and deliver powerful new AI products and services.”
Today's AI and High Performance Computing (HPC) applications are often complicated to install, and they almost always rely on libraries that must be installed at specific versions. An application's performance, or even its correct operation, often depends on having the right dependencies.
Software downloaded from Linux package managers like yum or apt is not always up to date and is often not built with performance in mind. Sometimes the software is not available in packaged form and needs to be built from source, which is a time-consuming process that requires additional libraries and dependencies.
While portability is important for system administrators, others like domain scientists, researchers, and engineers are looking for computational reproducibility.
Containers package applications, libraries, and configurations so that they run as a self-contained, isolated environment that is independent of the software installed on the host system. Because an application inside a container always sees the same environment, its behavior and performance are reproducible, and the container is portable across hosts.
NVIDIA GPU Cloud offers a container registry of Docker images for deep learning software, HPC applications, and HPC visualization tools. Containers are pre-built, optimized to take full advantage of GPUs, and ready to run on Oracle Cloud Infrastructure. There are over 35 containers in the repository, including GPU-accelerated deep learning frameworks, molecular dynamics (NAMD, GROMACS, LAMMPS), and visualization tools like ParaView with NVIDIA IndeX.
The Oracle-NGC-Deep-Learning-Image contains everything needed to run NGC containers on Oracle Cloud Infrastructure using compatible Bare Metal or Virtual Machine shapes (BM.GPU2.2, BM.GPU3.8, VM.GPU2.1, VM.GPU3.1, VM.GPU3.2, VM.GPU3.4).
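If you script your launches, it can help to check the shape before requesting an instance. The helper below is a minimal sketch that simply encodes the list of compatible shapes above; the function name is_ngc_shape is made up for this example.

```shell
# Return success only for the shapes this post lists as compatible
# with the NGC image. is_ngc_shape is an illustrative name.
is_ngc_shape() {
  case "$1" in
    BM.GPU2.2|BM.GPU3.8|VM.GPU2.1|VM.GPU3.1|VM.GPU3.2|VM.GPU3.4) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage (not executed here):
#   is_ngc_shape "VM.GPU3.1" && echo "shape can run the NGC image"
```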
To use NGC containers on Oracle Cloud Infrastructure, log in to the Oracle Cloud Infrastructure Console, configure the settings as needed, and then create an instance based on the Oracle-NGC-Deep-Learning-Image by specifying the image OCID. After launching the instance, you can SSH into it and pull your desired container from the NGC container registry.
To access all of the containers available from the NGC container registry, you need to authenticate to NGC. To do this, sign up for an NGC account at no charge, and create an NGC API key on the NGC website.
Once you’ve signed up, on the NGC Registry page, click Get API Key, then Generate API Key, and then Confirm to generate the key. If you have an existing API key, it becomes invalid once you generate a new key.
Open the Console. For steps on how to do this, see Signing In to the Console.
Click Compute, choose a compartment you have permission to work in, and then click Create Instance.
In the Create Instance dialog box, specify the instance name, and select the availability domain for the instance.
Click Change Image Source.
On the Oracle Images tab, check NVIDIA GPU Cloud Machine Image, review and accept the Agreement for Oracle App "NVIDIA GPU Cloud Machine Image", and then click Select Image.
Select Virtual Machine or Bare Metal Machine for Instance Type.
Select the shape you want to use for Shape.
For SSH Keys, click Choose SSH Key File, navigate to the location where you saved the public key portion (.pub) of the SSH key file you created, select the file, and then click Open.
In the Configure Networking section, select the virtual cloud network (VCN) compartment, VCN, subnet compartment, and subnet.
Click Create Instance.
Oracle Cloud Infrastructure provides a Command Line Interface (CLI) you can use to complete tasks. For more information, see Quickstart and Configuration. Use the launch command to create an instance, specifying image for sourceType and the following image OCID in InstanceSourceDetails for LaunchInstanceDetails:
ocid1.image.oc1..aaaaaaaaknl6phck7e3iuii4r4axpwhenw5qtnnsk3tqppajdjzb5nhoma3q
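As a rough sketch of the CLI route, the function below wraps oci compute instance launch with the image OCID from this post. The function name, the display name ngc-instance, and the order of the arguments are choices made for this example. The compartment OCID, availability domain, subnet OCID, shape, and public key file are values you supply from your own tenancy.

```shell
# Image OCID as given in this post.
NGC_IMAGE_OCID="ocid1.image.oc1..aaaaaaaaknl6phck7e3iuii4r4axpwhenw5qtnnsk3tqppajdjzb5nhoma3q"

# launch_ngc_instance is an illustrative wrapper, not part of the OCI CLI.
# Arguments: compartment OCID, availability domain, subnet OCID, shape,
# and the path to the public (.pub) SSH key file.
launch_ngc_instance() {
  if [ "$#" -ne 5 ]; then
    echo "usage: launch_ngc_instance COMPARTMENT_OCID AD SUBNET_OCID SHAPE PUBKEY_FILE" >&2
    return 2
  fi
  oci compute instance launch \
    --compartment-id "$1" \
    --availability-domain "$2" \
    --subnet-id "$3" \
    --shape "$4" \
    --image-id "$NGC_IMAGE_OCID" \
    --display-name "ngc-instance" \
    --ssh-authorized-keys-file "$5"
}

# Example usage (placeholders, not executed here):
#   launch_ngc_instance "$COMPARTMENT_OCID" "$AD_NAME" "$SUBNET_OCID" VM.GPU3.1 ~/.ssh/id_rsa.pub
```

The command prints a JSON description of the new instance, including the instance OCID you need for later calls.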
You should now see the NGC instance with the status of Provisioning. Once the status has changed to Running, you can connect to the server.
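If you launched from the CLI, you can also watch for the Running status from a script instead of the Console. The sketch below polls oci compute instance get and reads the lifecycle-state field; the function name wait_for_running and the 15-second default interval are assumptions made for this example.

```shell
# Poll an instance until it reports RUNNING.
# wait_for_running is an illustrative helper, not an OCI CLI command.
# Arguments: instance OCID, optional poll interval in seconds (default 15).
wait_for_running() {
  instance_id=$1
  interval=${2:-15}
  while :; do
    # --query extracts one field from the JSON response;
    # --raw-output prints it without surrounding quotes.
    state=$(oci compute instance get --instance-id "$instance_id" \
      --query 'data."lifecycle-state"' --raw-output) || return 1
    echo "state: $state"
    case "$state" in
      RUNNING) return 0 ;;
      TERMINATING|TERMINATED) return 1 ;;
    esac
    sleep "$interval"
  done
}

# Example usage (placeholder OCID, not executed here):
#   wait_for_running "$INSTANCE_OCID" && echo "instance is ready"
```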
For general information about launching Compute instances, see Creating an Instance.
Because the image is based on Ubuntu 16.04 LTS, the username for the connection is ubuntu.
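For example, the connection can be sketched as follows. The function name and the default key path are assumptions; use the private key that matches the public key you uploaded, and the public IP address shown for your instance in the Console.

```shell
# Open an SSH session to the NGC instance as the ubuntu user.
# connect_to_ngc is an illustrative name.
# Arguments: public IP address, optional private key path
# (the default ~/.ssh/id_rsa is only an assumption).
connect_to_ngc() {
  ssh -i "${2:-$HOME/.ssh/id_rsa}" "ubuntu@$1"
}

# Example usage (placeholder address, not executed here):
#   connect_to_ngc 203.0.113.10 ~/.ssh/oci_key
```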
After connecting to the instance, you are greeted with the following message:
Enter your NGC API key and press ENTER.
Logging into the NGC Registry at nvcr.io.....Login Succeeded
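If you need to log in again later, for example after generating a new API key, you can authenticate to the registry by hand with docker login. The sketch below assumes the key is stored in an environment variable named NGC_API_KEY, which is a name chosen for this example. The NGC registry expects the literal user name $oauthtoken, with the API key as the password.

```shell
# Log in to the NGC registry (nvcr.io) with an API key.
# ngc_login and NGC_API_KEY are illustrative names.
ngc_login() {
  if [ -z "${NGC_API_KEY:-}" ]; then
    echo "NGC_API_KEY is not set" >&2
    return 1
  fi
  # '$oauthtoken' is single-quoted so the shell passes it literally.
  # The key goes in on stdin so it does not show up in the process list.
  printf '%s' "$NGC_API_KEY" |
    docker login nvcr.io --username '$oauthtoken' --password-stdin
}

# Example usage (not executed here):
#   NGC_API_KEY="<your key>" ngc_login
```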
You're now ready to use NGC containers. Validate the installation with the following command:
nvidia-docker run nvcr.io/nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04 nvidia-smi
If the command prints the nvidia-smi table listing the instance's GPUs, everything is working. The Pull complete lines in the output mean that the layers of the Docker image have been downloaded. Because images are built from layers, existing layers can be reused, which makes creating new images more efficient. Learn more on the Docker website.
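For a scripted version of the same check, you can ask nvidia-smi for machine-readable output and count the GPUs the container can see. This is a sketch using the CUDA image from the command above; the function names are made up for this example, and the count you should expect depends on the shape you launched.

```shell
# CUDA image used in the validation command in this post.
NGC_CUDA_IMAGE="nvcr.io/nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04"

# Print one CSV line per GPU visible inside the container.
# --rm removes the container once nvidia-smi exits.
list_container_gpus() {
  nvidia-docker run --rm "$NGC_CUDA_IMAGE" \
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
}

# Count the GPUs by counting the lines printed above.
count_container_gpus() {
  list_container_gpus | wc -l | tr -d ' '
}

# Example usage (not executed here):
#   echo "GPUs visible in the container: $(count_container_gpus)"
```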