Shatter the Million IOPS Barrier in the Cloud with OCI Block Storage

May 30, 2024 | 7 minute read
Max Verun
Senior Principal Product Manager

1.3M IOPS

We are pleased to announce that you can now achieve an aggregate of up to 1.3 million I/O operations per second (IOPS) and up to 12 GB per second of throughput per OCI Compute instance with the OCI Block Volume service. You can also now attach up to 32 Ultra High Performance (UHP) volumes to a single compute instance. This service update is particularly useful for high-performance I/O workloads, such as artificial intelligence and machine learning (AI/ML) with intensive data processing and generative AI, 3D modeling and simulation, and demanding blockchain processing. Beyond these modern workloads, traditional applications such as online transaction processing, data warehousing and analytics, and databases with large data sets and complex processing also benefit from this level of performance.

We continue to offer a single, simple, all-NVMe SSD-based volume type that is only available from OCI, versus the multiple complex, disjoint tiers from other cloud providers. You can scale the I/O performance of your attached block storage at any time with a simple dynamic performance slider, without the overhead of provisioning additional volumes or compute instances.

At 1.3 million IOPS per instance, block storage gives you a 63% increase over our prior industry-leading limit of 800,000 IOPS, without any change in OCI storage pricing. This performance update is generally available for all your existing and new volumes at no additional cost. We continue to guarantee performance with a financially backed service level agreement (SLA). You can configure the maximum IOPS per volume with the now familiar Volume Performance Units per GB (VPU/GB) using the dynamic performance slider. This setting configures volume performance on demand, and you can continue to adjust it dynamically for optimal performance without detaching, migrating, or otherwise disturbing the attached volumes on your compute instances.

How to Get This Level of Performance

This update lifts the previous limit of one UHP volume per compute instance. Single-volume performance remains unchanged: all bare metal instances continue to achieve UHP performance of up to 300,000 IOPS per volume at sub-millisecond latency. Virtual machines achieve up to 300,000 IOPS per volume with iSCSI attachments and up to 150,000 IOPS per volume with paravirtualized attachments.

You need to attach multiple volumes to a compute instance to get an aggregate of 1.3 million IOPS block storage performance.
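The number of volumes follows from simple arithmetic on the figures above. Here is a minimal sketch in Python, assuming each volume delivers its full per-volume maximum and the compute shape supports the 1.3 million IOPS aggregate. The function name `volumes_needed` is illustrative and not part of any OCI SDK.

```python
import math

PER_VOLUME_IOPS = 300_000         # UHP per-volume maximum quoted above
INSTANCE_TARGET_IOPS = 1_300_000  # aggregate per-instance maximum quoted above


def volumes_needed(target_iops, per_volume_iops=PER_VOLUME_IOPS):
    """Return the smallest volume count whose combined IOPS reaches target_iops."""
    if target_iops <= 0 or per_volume_iops <= 0:
        raise ValueError("IOPS values must be positive")
    return math.ceil(target_iops / per_volume_iops)


if __name__ == "__main__":
    count = volumes_needed(INSTANCE_TARGET_IOPS)
    print(f"{count} volumes at {PER_VOLUME_IOPS:,} IOPS each")
```

With the 150,000 IOPS per-volume figure quoted for paravirtualized attachments, the same function returns a larger count, so the attachment type matters when you size an instance.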

Below is an example run of the fio storage benchmark on a BM.Standard.E4.128 compute instance in one of our regions. With 5 UHP volumes attached, the instance achieves 1.3 million IOPS or more. This is predictable, sustained performance, not burst performance. Compute shapes evolve quickly, with newer and faster memory, CPU, and networking. Now your block storage also scales with high-performance compute shapes for your demanding applications in the cloud.

  • 5 x 2TB volumes, each with a 120 VPU/GB performance setting that guarantees 300K IOPS per volume. All are attached to the same BM.Standard.E4.128 compute instance.

5 x 2TB 300K IOPS UHP Volumes

  • The Edit Volume page in the Console shows that each volume is configured for 120 VPU/GB, which guarantees 300,000 IOPS per volume, using the performance slider.

Edit Volume - Default 120 VPU per GB, 300K IOPS per Volume

  • The storage benchmark tool fio is run on this compute instance across all 5 volumes using random read and write I/O. It shows predictable, steady performance, with an even distribution of 650K read and 650K write IOPS totaling 1.3 million IOPS.

FIO Benchmark Run
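A benchmark like this one can be described in an fio job file. The following Python sketch renders such a file under stated assumptions: this post does not list the fio parameters used for the run, so the block size, queue depth, job count, and runtime below are illustrative choices, and the device paths are placeholders for the volumes attached to your instance. Random writes to a raw device destroy its contents, so point this only at empty test volumes.

```python
# Placeholder device paths (assumption): substitute your own attached volumes.
DEVICES = [f"/dev/oracleoci/oraclevd{letter}" for letter in "bcdef"]

# Illustrative settings (assumptions, not the parameters of the run shown above).
GLOBAL_OPTIONS = {
    "ioengine": "libaio",   # Linux native asynchronous I/O
    "direct": 1,            # bypass the page cache
    "rw": "randrw",         # mixed random reads and writes
    "rwmixread": 50,        # 50% reads, 50% writes
    "bs": "4k",
    "iodepth": 64,
    "numjobs": 8,
    "time_based": 1,
    "runtime": 120,         # seconds
    "group_reporting": 1,
}


def render_fio_job(devices, options=GLOBAL_OPTIONS):
    """Return fio job file text: one [global] section plus one job per device."""
    lines = ["[global]"]
    lines += [f"{key}={value}" for key, value in options.items()]
    for index, device in enumerate(devices):
        lines += ["", f"[vol{index}]", f"filename={device}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_fio_job(DEVICES), end="")
```

Saving the output to a file such as `uhp.fio` (a hypothetical name) and running `fio uhp.fio` starts one job section per volume and reports the combined read and write IOPS.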

The OCI Block Volume performance page provides more detail on how to achieve this level of performance, which depends on the memory and network bandwidth configuration of the compute shape. Not all compute shapes can reach 1.3 million IOPS for remote block storage.

Block Storage Performance per Instance on Bare Metal (BM) Shapes

| Shape | OCPU | Memory (GB) | Max Network Bandwidth | Max IOPS per Instance (up to) | Max Throughput per Instance (Block Volume) |
|---|---|---|---|---|---|
| BM.Standard.E5.192 | 192 | 2304 | 1 x 100 Gbps | 1,300,000 | 12 GB/s |
| BM.Standard.E4.128 | 128 | 2048 | 2 x 50 Gbps | 1,300,000 | 6 GB/s |
| BM.DenseIO.E4.128 | 128 | 2048 | 2 x 50 Gbps | 1,300,000 | 6 GB/s |
| BM.Standard3.64 | 64 | 1024 | 2 x 50 Gbps | 1,300,000 | 6 GB/s |
| BM.Optimized3.36 | 36 | 512 | 2 x 50 Gbps; 1 x 100 Gbps RDMA | 1,300,000 | 6 GB/s |
| BM.GPU.A100-v2.8 | 128 | 640 | 2 x 50 Gbps; 16 x 100 Gbps RDMA | 1,300,000 | 6 GB/s |
| BM.GPU.A10.4 | 64 | 96 | 2 x 50 Gbps | 1,300,000 | 6 GB/s |
| BM.GPU4.8 | 64 | GPU: 320 GB; CPU: 2048 GB | 1 x 50 Gbps; 8 x 200 Gbps RDMA | 1,300,000 | 6 GB/s |
| BM.Standard.A1.160 | 160 | 2048 | 2 x 50 Gbps | 800,000 | 6 GB/s |
| BM.GPU3.8 | 52 | GPU: 128 GB; CPU: 768 GB | 2 x 50 Gbps | 625,000 | 3 GB/s |
| BM.GPU2.2 | 28 | GPU: 32 GB; CPU: 192 GB | 2 x 25 Gbps | 625,000 | 3 GB/s |
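To see how the per-shape ceiling interacts with the per-volume maximum, the following sketch estimates the aggregate IOPS for a given number of attached UHP volumes. It assumes every volume runs at the 300,000 IOPS bare metal maximum and copies only a few rows from the bare metal table; the dictionary and function names are illustrative, not an OCI API.

```python
PER_VOLUME_IOPS = 300_000  # UHP per-volume maximum on bare metal, from this post

# Per-instance IOPS ceilings copied from selected rows of the bare metal table.
SHAPE_MAX_IOPS = {
    "BM.Standard.E5.192": 1_300_000,
    "BM.Standard.E4.128": 1_300_000,
    "BM.Standard.A1.160": 800_000,
    "BM.GPU3.8": 625_000,
}


def aggregate_iops(shape, volumes, per_volume_iops=PER_VOLUME_IOPS):
    """Estimate aggregate IOPS: the volumes' sum, capped by the shape's ceiling."""
    if volumes < 0:
        raise ValueError("volumes must be non-negative")
    return min(volumes * per_volume_iops, SHAPE_MAX_IOPS[shape])


if __name__ == "__main__":
    for shape in SHAPE_MAX_IOPS:
        print(f"{shape}: {aggregate_iops(shape, 5):,} IOPS with 5 volumes")
```

On shapes with a lower ceiling, attaching more volumes than the ceiling can absorb adds capacity but no further aggregate IOPS.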

 

Block Storage Performance per Instance on Virtual Machine (VM) Shapes

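| Shape | OCPU | Memory (GB) | Max Network Bandwidth | Max IOPS per Instance | Max Throughput per Instance (Block Volume) |
|---|---|---|---|---|---|
| VM.Standard.E5.Flex | 1 OCPU minimum, 90 OCPU maximum | 1 GB minimum, 1049 GB maximum | 1 Gbps per OCPU, maximum 40 Gbps | 20,000 * max network bandwidth in Gbps (up to 600,000) | 120 MB/s * max network bandwidth in Gbps (up to 4,800 MB/s) |
| VM.Standard.E4.Flex | 1 OCPU minimum, 64 OCPU maximum | 1 GB minimum, 1024 GB maximum | 1 Gbps per OCPU, maximum 40 Gbps | 20,000 * max network bandwidth in Gbps (up to 600,000) | 120 MB/s * max network bandwidth in Gbps (up to 4,800 MB/s) |
| VM.DenseIO.E4.Flex | 8 / 16 / 32 | 128 / 256 / 512 | 8 Gbps / 16 Gbps / 32 Gbps | 20,000 * max network bandwidth in Gbps (up to 600,000) | 120 MB/s * max network bandwidth in Gbps (up to 4,800 MB/s) |
| VM.Standard3.Flex | 1 OCPU minimum, 32 OCPU maximum | 1 GB minimum, 512 GB maximum | 1 Gbps per OCPU, maximum 32 Gbps | 20,000 * max network bandwidth in Gbps (up to 600,000) | 120 MB/s * max network bandwidth in Gbps (up to 4,800 MB/s) |
| VM.Optimized3.Flex | 1 OCPU minimum, 18 OCPU maximum | 1 GB minimum, 256 GB maximum | 4 Gbps per OCPU, maximum 40 Gbps | 20,000 * max network bandwidth in Gbps (up to 600,000) | 120 MB/s * max network bandwidth in Gbps (up to 4,800 MB/s) |
| VM.Standard.A1.Flex | 1 OCPU minimum, 80 OCPU maximum | 1 GB minimum, 1024 GB maximum | 1 Gbps per OCPU, maximum 40 Gbps | 20,000 * max network bandwidth in Gbps (up to 600,000) | 120 MB/s * max network bandwidth in Gbps (up to 4,800 MB/s) |
| VM.GPU.A10.2 | 30 | GPU: 48 GB; CPU: 480 GB | 48 Gbps | 600,000 | 5,760 MB/s |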
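The scaling rule for flexible VM shapes can be worked through with a short calculation. This sketch applies the two formulas from the VM table (20,000 IOPS and 120 MB/s per Gbps of network bandwidth, capped at 600,000 IOPS and 4,800 MB/s); the function name is illustrative, and the result is an upper bound under those stated formulas, not a measured value. VM.GPU.A10.2 is listed with fixed values and is not covered by it.

```python
IOPS_PER_GBPS = 20_000        # from the VM table
MAX_IOPS = 600_000            # "up to 600,000"
MBPS_PER_GBPS = 120           # from the VM table
MAX_THROUGHPUT_MBPS = 4_800   # "up to 4,800 MB/s"


def vm_block_limits(network_gbps):
    """Return (max IOPS, max throughput in MB/s) for a flexible VM shape."""
    if network_gbps <= 0:
        raise ValueError("network bandwidth must be positive")
    iops = min(IOPS_PER_GBPS * network_gbps, MAX_IOPS)
    throughput = min(MBPS_PER_GBPS * network_gbps, MAX_THROUGHPUT_MBPS)
    return iops, throughput


if __name__ == "__main__":
    for gbps in (1, 8, 16, 32, 40):
        iops, mbps = vm_block_limits(gbps)
        print(f"{gbps:>2} Gbps: up to {iops:,} IOPS, {mbps:,} MB/s")
```

Under these formulas the IOPS cap is reached at 30 Gbps, while throughput keeps scaling up to the 40 Gbps maximum.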

 

Heinz Mielimonka, Customer Success Director and Cloud Architect at Oracle, provides additional insight in his blog post "Unimagined Disk Speed for Your Servers". He compares block storage performance levels across the largest cloud providers and provides tooling so you can verify for yourself that you can reach this level of performance on OCI.

Try it for yourself

We want you to experience this level of cloud storage performance and all the enterprise-grade capabilities that OCI offers. It’s easy to try them out with OCI Free Tier. For more information on taking advantage of these performance updates, see the Block Volume service overview, Ultra High Performance (UHP) volumes, Block Volume performance, dynamic performance scaling with auto-tuning, and OCI storage pricing.
