Oracle Cloud Infrastructure was recently ranked the 7th-fastest performer on the IO500, a widely recognized high-performance storage IO benchmark. The vast majority of other systems on the list are specialized on-premises environments, including research supercomputers. Using 270 nodes in an HPC cluster network running the BeeGFS BeeOND parallel file system, we achieved 500 GB/s of write IO throughput and 13.1 million IOPS of metadata performance.
To put this performance in perspective, Oracle's customer Zoom moves 7 PB of data each day to support 300 million meeting participants. The storage cluster used here could transfer and process that much data locally in less than 4 hours. As part of a computational fluid dynamics (CFD) cluster, we estimate that this much storage throughput would support 1,000 simultaneous users running typically sized jobs.
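The "less than 4 hours" figure can be checked with a quick back-of-the-envelope calculation. The sketch below assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes) and that the 500 GB/s write throughput is sustained for the whole transfer; the function name `transfer_hours` is illustrative only.

```python
# Back-of-the-envelope check of the transfer time quoted above.
# Assumptions (not from the benchmark report): decimal units and a
# sustained 500 GB/s for the entire transfer.

PB = 10**15  # bytes in a petabyte (decimal)
GB = 10**9   # bytes in a gigabyte (decimal)


def transfer_hours(data_bytes: float, throughput_bytes_per_s: float) -> float:
    """Return the hours needed to move data_bytes at a constant throughput."""
    seconds = data_bytes / throughput_bytes_per_s
    return seconds / 3600


hours = transfer_hours(7 * PB, 500 * GB)
print(f"{hours:.2f} hours")  # 14,000 seconds, i.e. about 3.89 hours
```

At 500 GB/s, 7 PB takes 14,000 seconds, which is where the sub-4-hour estimate comes from.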
Oracle's second-generation cloud infrastructure provides an HPC bare metal compute shape (BM.HPC2.36) with 100-Gbps RDMA over Converged Ethernet (RoCEv2) and 6.4 TB of local NVMe SSD. A 100-Gbps clustered network is instrumental in delivering hundreds of gigabytes per second of IO throughput for a file system. Each node in the cluster ran the metadata, storage, and client services, while the management service ran on the first node.
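To relate the cluster-wide numbers to a single node, here is a rough sketch that divides the reported 500 GB/s evenly across the 270 nodes and compares the result with the 100-Gbps link. It assumes an even distribution of load and ignores protocol overhead and the fact that every node acts as both client and server, so treat the output as an average, not a measured per-node result. The raw capacity figure is simply the sum of local NVMe and is not the usable file system size.

```python
# Rough per-node view of the cluster-wide figures.
# Assumptions: load is spread evenly across nodes, network overhead is
# ignored, and decimal units are used throughout.

NODES = 270               # nodes in the benchmark cluster
AGG_WRITE_GB_S = 500.0    # reported aggregate write throughput, GB/s
NIC_GBIT_S = 100.0        # RoCEv2 link speed per node, Gbit/s
NVME_TB_PER_NODE = 6.4    # local NVMe SSD per BM.HPC2.36 node, TB

per_node_write_gb_s = AGG_WRITE_GB_S / NODES        # average share per node
nic_line_rate_gb_s = NIC_GBIT_S / 8                 # Gbit/s -> GB/s
nic_share = per_node_write_gb_s / nic_line_rate_gb_s
raw_capacity_tb = NODES * NVME_TB_PER_NODE          # raw, not usable, capacity

print(f"Average write per node: {per_node_write_gb_s:.2f} GB/s")
print(f"NIC line rate:          {nic_line_rate_gb_s:.1f} GB/s")
print(f"Share of line rate:     {nic_share:.0%}")
print(f"Raw NVMe capacity:      {raw_capacity_tb:.0f} TB")
```

Under these assumptions, each node averages about 1.85 GB/s of writes, roughly 15% of the 12.5 GB/s line rate, across about 1,728 TB of raw NVMe.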
RoCEv2 is a key feature of our HPC cluster network offering on Oracle Cloud Infrastructure. Oracle Cloud customers can run their file system's server-client block communication on the fast and reliable 100-Gbps RDMA infrastructure to get a significant performance boost through reduced latency and increased throughput.
Let's take a more detailed look at the benchmark results
The IO500 ranked list for high-performance storage systems was published recently as part of the ISC High Performance 2020 event. The following table and graphs provide insights generated from the published list.
Figure 1: IO500 Benchmark Results
Figure 2: Overall Bandwidth Scores
Figure 3: Bandwidth per Process Scores
Figure 4: Top Six BeeGFS File Servers
Figure 5: Top Four Spectrum Scale File Servers
Deploy the BeeGFS BeeOND architecture using RDMA on Oracle Cloud Infrastructure by using our Terraform template. Within a few minutes, the following architecture is deployed in your tenancy.
Figure 6: BeeGFS BeeOND RDMA Cluster Architecture
Oracle Cloud’s high-performance, multiple-instance-attachment block storage combined with HPC compute instances has set a new standard for file systems in the cloud. An HPC file system that provides on-premises performance at cloud prices can be deployed or scaled in minutes. If you want an HPC file system that provides over 13 million IOPS and 500 GB/s of throughput, check out Oracle Cloud’s HPC File System on the Cloud Marketplace and the BeeGFS On Demand (BeeOND) Quickstart deployment script.
Let us know how fast you make your file system!