Oracle Cloud Infrastructure has continued to build momentum in high-performance computing (HPC) since our last update in November. That momentum starts with helping customers solve their challenges faster and more cost-effectively, often in ways that were not previously possible in the cloud.
Several large manufacturers, including Nissan, have moved to an all-digital workflow on Oracle Cloud Infrastructure for product design, engineering, and testing. Running these workloads in our cloud lets companies rapidly deploy HPC infrastructure wherever it’s needed in the workflow, while paying only for what they consume.
Until recently, the cloud could not meet the performance and latency requirements (two microseconds or less) of many digital engineering applications. Oracle Cloud Infrastructure broke this barrier in 2018 by introducing cloud-based cluster networking and HPC compute instances. These offerings unlocked the ability to run Message Passing Interface (MPI) applications, such as computational fluid dynamics (CFD), in the cloud. This year, we’ve reduced the deployment of Oracle Cloud cluster networks to a few clicks. We’ve also continued to expand our ecosystem with partners like Convergent Science, which validated that Oracle Cloud Infrastructure’s performance scaling is nearly ideal up to 4,000 compute cores.
Figure 1: CONVERGE 3.0 scaling on Oracle Cloud Infrastructure for a combusting turbulent partially premixed flame (Sandia Flame D) simulation.
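For readers who want to check "nearly ideal" scaling on their own MPI runs, here is a minimal sketch of the usual strong-scaling calculation: speedup relative to the smallest run, divided by the ideal speedup for the added cores. The function name `scaling_table` and the timings in `example_runs` are hypothetical placeholders for illustration; they are not measurements from the CONVERGE study.

```python
from typing import List, Tuple


def scaling_table(runs: List[Tuple[int, float]]) -> List[Tuple[int, float, float]]:
    """Return (cores, speedup, efficiency) rows relative to the smallest run.

    Each input pair is (core count, wall-clock seconds) for the same problem.
    """
    runs = sorted(runs)
    base_cores, base_seconds = runs[0]
    table = []
    for cores, seconds in runs:
        speedup = base_seconds / seconds      # measured speedup vs. baseline
        ideal = cores / base_cores            # speedup if scaling were perfect
        table.append((cores, speedup, speedup / ideal))
    return table


# Hypothetical timings for illustration only (NOT data from Figure 1).
example_runs = [(500, 800.0), (1000, 400.0), (2000, 205.0), (4000, 105.0)]

for cores, speedup, efficiency in scaling_table(example_runs):
    print(f"{cores:>5} cores  speedup {speedup:5.2f}x  efficiency {efficiency:6.1%}")
```

An efficiency that stays close to 100 percent as cores are added is what "nearly ideal" scaling means; a steady drop usually points to communication or I/O overhead growing with the core count.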
While big product news like Oracle’s announcement of NVIDIA A100-based instances often captures the spotlight, the application of such technologies is frequently more intriguing. Over the past six months, we’ve released several GPU-enabled machine images in the Oracle Marketplace to simplify the deployment of Data Science virtual machine environments, GPU-enabled workstations (with Visual Studio), DNA sequencing capabilities (with NVIDIA and Parabricks), and more.
These machine images and Oracle Cloud Infrastructure’s comprehensive portfolio of HPC solutions now address use cases that include drug discovery at top pharmaceutical companies, environmental protection with drone image analysis at the San Francisco Estuary Institute, animation rendering at Brigham Young University, and carbon emission reduction with 3D microtomography analysis at Royal Holloway University.
High-performance computing and machine learning can involve terabytes or petabytes of data, processed as billions of small files or as much larger high-resolution images and videos. These types of workloads require a new level of storage capability and performance in the cloud. HPC file systems need to handle thousands of parallel streams of data, gigabytes per second of aggregate throughput, and tens of millions of aggregate input/output operations per second (IOPS).
In 2019, we were proud to be one of only two public clouds on the list of the 20 fastest HPC storage environments in the world, published by the Virtual Institute for IO. We used an IBM Spectrum Scale on Oracle Cloud Infrastructure solution for that result. In less than a year, we’ve blown past that benchmark, demonstrating seven times the performance at 140 GB/s with BeeGFS.
Now, we offer a range of HPC file system options, from Oracle Block Volume-based solutions that cost only $0.05/GB per month at massive scale to higher-performance clustered node solutions using our HPC Compute instances.
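To put the $0.05/GB per month figure in perspective, the following sketch estimates a monthly storage bill at a few capacities. It assumes a flat rate with no tiering, decimal units (1 TB = 1,000 GB), and capacity charges only; compute, performance options, and data transfer are outside its scope. The function name `monthly_storage_cost` is a hypothetical helper, not part of any Oracle tool.

```python
GB_PER_TB = 1_000        # decimal units, an assumption of this sketch
GB_PER_PB = 1_000_000


def monthly_storage_cost(capacity_gb: float, rate_per_gb_month: float = 0.05) -> float:
    """Estimate monthly cost in dollars at a flat per-GB rate (assumed, no tiers)."""
    if capacity_gb < 0:
        raise ValueError("capacity_gb must be non-negative")
    return capacity_gb * rate_per_gb_month


# Capacities chosen only for illustration.
for label, capacity_gb in [("100 TB", 100 * GB_PER_TB), ("1 PB", GB_PER_PB)]:
    print(f"{label:>6}: ${monthly_storage_cost(capacity_gb):,.2f} per month")
```

Under these assumptions, a petabyte of capacity works out to about $50,000 per month; check current pricing before budgeting, since rates and billing units can change.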
In addition to making it easier to launch Oracle Cloud Infrastructure’s HPC solutions through machine images and automation stacks, we’ve also released an HPC solutions certification and associated free content to teach you the fundamentals.
There’s no better time to start improving your HPC results. See the following resources for our key HPC solutions, images, and stacks: