In 2020, we released the virtual machine (VM) image for Data Science and Machine Learning (ML) services on Oracle Cloud Infrastructure (OCI), which uses NVIDIA GPUs to speed up AI application development. For background information, read the blog announcing the release.
Since then, we have released three updates for NVIDIA P100 and V100 Tensor Core GPUs to address security vulnerabilities. Previously, the features of this OCI VM image did not match those of the OCI Data Science service, and the image had no support for NVIDIA Container Runtime. This fiscal year, we have addressed these gaps and released the following new Data Science images on OCI:
- Oracle Linux image with NVIDIA GPU drivers
  - Oracle-Linux-8.6-Gen2-GPU-2022.05.31-0
  - Image family: Oracle Linux 8.x
  - Operating system: Oracle Linux
  - Release date: May 31, 2022
- Ubuntu Linux image with NVIDIA GPU drivers
  - Image family: Ubuntu 20.04
  - Operating system: Ubuntu
  - Release date: May 31, 2022
- Oracle Linux image for x86 (supporting both Intel and AMD)
  - Image family: Oracle Linux 8.x
  - Operating system: Oracle Linux
  - Release date: Aug. 10, 2022
Essential features
The images have several popular tools preinstalled for modeling, development, training, and inference.
Operating system, drivers, and other base components:
- Operating system of your choice: Oracle Linux or Ubuntu Linux
- NVIDIA drivers, CUDA Toolkit, cuDNN library (when GPU machines are used)
- Docker, NVIDIA Container Runtime, NVIDIA Container Toolkit
- Anaconda (“conda”)
- Git
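To confirm that the base components listed above are actually present on a freshly provisioned instance, a quick check like the following can help. This is a minimal sketch: the executable names (`nvidia-smi`, `nvcc`, and so on) are assumptions about how each component is exposed on the PATH, and the GPU-related entries will only be found on GPU shapes.

```python
import shutil

# Executable names are assumptions; adjust them to match the image you deployed.
BASE_TOOLS = {
    "NVIDIA driver": "nvidia-smi",
    "CUDA Toolkit": "nvcc",
    "Docker": "docker",
    "conda": "conda",
    "Git": "git",
}


def find_tools(tools=BASE_TOOLS, which=shutil.which):
    """Map each component name to its executable path, or None if it is not on PATH."""
    return {name: which(exe) for name, exe in tools.items()}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name:15} {path or 'not found on PATH'}")
```

A missing entry does not necessarily mean the component is absent; it may be installed outside the PATH of the current shell.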
Authoring tools:
- Visual Studio Code
- PyCharm Community Edition
- Jupyter and JupyterLab
ML frameworks:
- PyTorch, TensorFlow, MXNet, scikit-learn
- PySpark
- Dask
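Which of these frameworks is importable depends on the conda environment that is currently active. The sketch below checks for each one without importing it. The module names are assumptions based on each framework's usual top-level import name, not something this post specifies.

```python
import importlib.util

# Import names are assumptions: each framework's usual top-level module.
FRAMEWORK_MODULES = {
    "PyTorch": "torch",
    "TensorFlow": "tensorflow",
    "MXNet": "mxnet",
    "scikit-learn": "sklearn",
    "PySpark": "pyspark",
    "Dask": "dask",
}


def available_frameworks(modules=FRAMEWORK_MODULES):
    """Return {framework: bool} by locating each module without importing it."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in modules.items()
    }


if __name__ == "__main__":
    for name, found in available_frameworks().items():
        print(f"{name:13} {'available' if found else 'not in this environment'}")
```

Run it once per conda environment to see how the environments differ.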
Other notable components:
- OCI command line interface (CLI)
- OCI software development kit (SDK) for Python
- OCI Accelerated Data Science (ADS) kit (including MLX and AutoML)
Data Science notebook:
- Data Science Conda environments for x86, including General ML. Install them without the conda zip files on the drive.
- Basic Python conda environment installed on the base image of the notebook session
- Example notebooks
- JupyterLab v2.X
- Oracle Data Science CLI (requires setup)
- ADS CLI
The images have preconfigured conda environments that include NVIDIA GPU drivers, the CUDA Toolkit, the cuDNN library, common Python and R integrated development environments (IDEs), Jupyter notebooks, and open source ML and deep learning frameworks. For high-performance shared file systems, the BeeGFS client is also preconfigured in the current image.
Computational power, ease of use, productivity, and efficiency are the main reasons this solution is attractive to data scientists. It lets users run machine learning models across all of OCI’s GPU-accelerated machines, including machines built on NVIDIA A100 Tensor Core GPUs and NVIDIA A10 Tensor Core GPUs. Oracle’s powerful automation capabilities enable the deployment of hundreds of instances for model experimentation and testing through a single click. No more requests for provisioning. No more maintenance issues or refreshes. Just straight deployment that’s ready to go with the right open source frameworks installed, upgraded, and tested.
How to deploy the images
You can deploy the images from the Oracle Cloud Console or Oracle Cloud Marketplace. You can also provision the images using the OCI CLI or SDK, from a stand-alone Terraform script, or through Oracle Resource Manager, for both the NVIDIA GPU image and the x86 image.
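As an illustration of the CLI route, the following sketch builds an `oci compute instance launch` command for one of the images. Every OCID, the availability domain name, and the display name are placeholders, and the default shape `VM.GPU3.1` is only an assumed example. Verify the flags against `oci compute instance launch --help` for your CLI version before running the result.

```python
import shlex


def launch_command(compartment_id, availability_domain, subnet_id, image_id,
                   shape="VM.GPU3.1", display_name="ds-vm"):
    """Build an `oci compute instance launch` argument list (does not run it)."""
    return [
        "oci", "compute", "instance", "launch",
        "--compartment-id", compartment_id,
        "--availability-domain", availability_domain,
        "--subnet-id", subnet_id,
        "--image-id", image_id,          # OCID of the Data Science image
        "--shape", shape,                # assumed example shape
        "--display-name", display_name,  # hypothetical name
    ]


if __name__ == "__main__":
    # All values below are placeholders, not real resources.
    cmd = launch_command(
        compartment_id="ocid1.compartment.oc1..example",
        availability_domain="Uocm:PHX-AD-1",
        subnet_id="ocid1.subnet.oc1.phx.example",
        image_id="ocid1.image.oc1.phx.example",
    )
    print(shlex.join(cmd))
```

Building the argument list separately from running it makes the command easy to review, and it can be passed to `subprocess.run` once the values are real.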


Overall solution
You can access the instances through Remote Desktop, SSH, or a browser-based interface such as JupyterHub. You have full access to the instances. If needed, you can adjust configurations and install other frameworks as with any other VM. Maintenance and protection against vulnerabilities of provisioned Data Science VMs are the customer’s responsibility.
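When the instance is not directly reachable from your workstation, one common way to combine SSH and browser access is a local port forward. The sketch below assembles such an `ssh` command. The user name `opc`, port 8888, and the optional bastion jump host are assumptions for illustration; use `ubuntu` on the Ubuntu image and whichever port your Jupyter server actually listens on.

```python
def jupyter_tunnel_command(host, user="opc", local_port=8888,
                           remote_port=8888, bastion=None):
    """Build an ssh command that forwards a local port to Jupyter on the VM."""
    # -N: do not run a remote command; -L: local port forwarding.
    cmd = ["ssh", "-N", "-L", f"{local_port}:localhost:{remote_port}"]
    if bastion:
        # -J: connect through a jump host (for example, a bastion).
        cmd += ["-J", bastion]
    cmd.append(f"{user}@{host}")
    return cmd


if __name__ == "__main__":
    # Hypothetical addresses; replace with your own.
    print(" ".join(jupyter_tunnel_command("10.0.1.5", bastion="opc@203.0.113.10")))
```

With the tunnel open, the notebook server would be reachable at `http://localhost:8888` in a local browser, assuming the server runs on that port.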
The solution includes a reference architecture that deploys a bastion host, training node, inference node, user application VM, and other components on OCI. It uses a region with one availability domain and regional subnets, and the same architecture can be used in a region with multiple availability domains. It also runs on the Oracle Linux 7.x GPU version. With Oracle Ksplice, updates can be applied in real time without shutting down instances.
Security and data privacy are of utmost concern. OCI VMs for Data Science and AI include a security-first design built for the enterprise. Data sets are encrypted, and data privacy is protected through data minimization and transparency. In other words, Oracle has no insight into data on its cloud and is transparent about where this data is processed and stored.
Finally, data scientists can expand their Compute resources by using autoscaling, or control costs by stopping a Compute instance when it’s not needed. The VM also includes basic sample data and code for testing and exploring.
Try for yourself
To learn more about our GPU shapes and try them out, visit the NVIDIA and Oracle Cloud Infrastructure NVIDIA GPU Cloud Platform. To learn more about Oracle’s data science solutions, visit the OCI Data Science page, and follow us on Twitter.
Get your free account for Oracle Cloud Infrastructure and get started today!
