The authors want to thank Dr. Sanjay Basu, Senior Director of OCI Engineering, and Rob Dolin, Senior Manager of OCI Engineering, for their assistance in publishing these results.
Oracle Cloud Infrastructure (OCI) has achieved strong results across multiple benchmarks in the MLCommons Inference Datacenter v3.1 suite, the industry standard for measuring AI infrastructure performance. OCI was tested across several shapes powered by NVIDIA GPUs, including the NVIDIA H100 Tensor Core GPU, the NVIDIA A100 Tensor Core GPU, and the NVIDIA A10 Tensor Core GPU; the key results for each shape are summarized below.
From its inception, OCI has focused on providing high-performance infrastructure for any workload. We were one of the first cloud providers to natively offer bare metal instances and a high-performance RDMA network for internode communication. We support all phases of the AI workflow, including training and inference. Many organizations are driving innovation on OCI with AI and NVIDIA GPUs, including MosaicML, Twist Bioscience, and Emory University.
This announcement marks our inaugural publication of OCI's MLPerf results through the MLCommons Inference Datacenter v3.1 benchmarks. MLCommons is a collaborative engineering organization focused on developing the AI ecosystem through benchmarks, public datasets, and research. OCI's portfolio of GPU shapes includes NVIDIA H100, NVIDIA A100, and NVIDIA A10 GPUs, among others. We have obtained inference benchmark results for all three of these industry-leading NVIDIA GPUs.
The BM.GPU.H100.8 shape includes eight NVIDIA H100 GPUs with 80 GB of GPU memory each. The CPUs are Intel Xeon Platinum 8480+ processors with 112 cores and 2 TB of system memory. The shape also includes 16 local NVMe drives with a capacity of 3.84 TB each.
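For readers who want to reproduce a similar setup, the following is a minimal sketch of provisioning this shape with the OCI Python SDK. The availability domain and OCIDs below are placeholders, not real values; substitute identifiers from your own tenancy.

```python
# Sketch: launch a BM.GPU.H100.8 instance with the OCI Python SDK.
# All OCIDs and the AD name are placeholders for illustration only.
import oci

config = oci.config.from_file()  # reads ~/.oci/config by default
compute = oci.core.ComputeClient(config)

details = oci.core.models.LaunchInstanceDetails(
    availability_domain="Uocm:PHX-AD-1",               # placeholder AD
    compartment_id="ocid1.compartment.oc1..example",   # placeholder OCID
    shape="BM.GPU.H100.8",                             # 8x NVIDIA H100
    display_name="mlperf-h100-node",
    source_details=oci.core.models.InstanceSourceViaImageDetails(
        image_id="ocid1.image.oc1..example"            # GPU-enabled image
    ),
    create_vnic_details=oci.core.models.CreateVnicDetails(
        subnet_id="ocid1.subnet.oc1..example"          # placeholder subnet
    ),
)
instance = compute.launch_instance(details).data
print(instance.id, instance.lifecycle_state)
```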
The following table shows the benchmark results for BM.GPU.H100.8. The results are comparable or superior to those from alternative offerings. For the complete results, see the MLCommons Inference Datacenter page. A brief sketch of how the Server and Offline scenarios are generated follows the table.
| Benchmark | Server (Queries/sec) | Offline (Samples/sec) |
|---|---|---|
| ResNet | 584,197.00 | 703,548.00 |
| RetinaNet | 12,884.60 | 14,047.20 |
| 3D U-Net 99 | - | 51.45 |
| 3D U-Net 99.9 | - | 51.48 |
| BERT 99 | 56,022.10 | 70,689.90 |
| BERT 99.9 | 49,617.50 | 62,285.50 |
| DLRM v2 99 | 300,033.00 | 339,265.00 |
| DLRM v2 99.9 | 300,033.00 | 339,050.00 |
| GPT-J 99 | 79.90 | 106.69 |
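The two columns correspond to the two MLPerf datacenter scenarios: Server measures queries per second under a latency bound with randomized query arrivals, while Offline measures pure batch throughput. Both are driven by MLCommons LoadGen. The sketch below shows the shape of a minimal LoadGen harness using its Python bindings; exact signatures can vary slightly across LoadGen releases, and the inference call itself is elided.

```python
# Minimal sketch of an MLCommons LoadGen harness (Python bindings).
import mlperf_loadgen as lg

def issue_queries(query_samples):
    # Run inference here; report one response per sample when done.
    # (0, 0) stands in for a pointer/size to real output data.
    responses = [lg.QuerySampleResponse(s.id, 0, 0) for s in query_samples]
    lg.QuerySamplesComplete(responses)

def flush_queries():
    pass  # drain any batching queues held by the system under test

settings = lg.TestSettings()
settings.scenario = lg.TestScenario.Offline   # or lg.TestScenario.Server
settings.mode = lg.TestMode.PerformanceOnly

# QSL: 1,024 total samples, all resident; no-op loaders for brevity.
qsl = lg.ConstructQSL(1024, 1024, lambda ids: None, lambda ids: None)
sut = lg.ConstructSUT(issue_queries, flush_queries)

lg.StartTest(sut, qsl, settings)  # writes mlperf_log_summary.txt
lg.DestroyQSL(qsl)
lg.DestroySUT(sut)
```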
The BM.GPU.A100-v2.8 shape includes eight NVIDIA A100 GPUs with 80 GB of GPU memory each. The CPUs are AMD EPYC 7J13 64-core processors with 128 cores in total and 2 TB of system memory. The benchmark results were on par with, if not superior to, those of the other hyperscalers.
| Benchmark | Server (Queries/sec) | Offline (Samples/sec) |
|---|---|---|
| ResNet | 290,028.00 | 325,567.00 |
| RetinaNet | 5,603.34 | 6,512.98 |
| 3D U-Net 99 | - | 30.32 |
| 3D U-Net 99.9 | - | 30.33 |
| RNNT | 104,012.00 | 107,408.00 |
| BERT 99 | 25,406.20 | 28,028.60 |
| BERT 99.9 | 12,824.10 | 14,534.40 |
| DLRM v2 99 | 80,018.10 | 138,331.00 |
| DLRM v2 99.9 | 80,018.10 | 138,179.00 |
| GPT-J 99 | 16.92 | 27.13 |
| GPT-J 99.9 | 17.04 | 25.29 |
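Because both shapes ran many of the same Offline workloads, the per-model speedup of H100 over A100 can be read directly from the two tables. The short snippet below does that arithmetic with the published numbers, nothing more.

```python
# Offline samples/sec taken from the two tables above (H100, A100).
offline = {
    "ResNet":      (703_548.00, 325_567.00),
    "RetinaNet":   (14_047.20,    6_512.98),
    "3D U-Net 99": (51.45,       30.32),
    "BERT 99":     (70_689.90,  28_028.60),
    "DLRM v2 99":  (339_265.00, 138_331.00),
    "GPT-J 99":    (106.69,      27.13),
}
for name, (h100, a100) in offline.items():
    print(f"{name:12s} H100/A100 = {h100 / a100:.2f}x")
```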
The BM.GPU.A10.4 shape includes four NVIDIA A10 GPUs with 24 GB of GPU memory each and 1 TB of system memory. The shape also includes two 3.5-TB local NVMe drives.
The benchmarks identified that OCI Compute instances based on NVIDIA A10 GPUs are a suitable option for running inference on specific models at optimal price-performance; a rough price-performance sketch follows the table below. The GPT-J and DLRM v2 benchmarks were not run on BM.GPU.A10.4 for this iteration.
The complete results are available on the MLCommons Inference Datacenter page.
| Benchmark | Server (Queries/sec) | Offline (Samples/sec) |
|---|---|---|
| RetinaNet | 855.00 | 953.53 |
| 3D U-Net 99 | - | 5.15 |
| RNNT | 9,202.52 | 16,989.30 |
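One way to make the price-performance claim concrete is to divide throughput by hourly cost. The snippet below sketches that arithmetic using the published Offline RetinaNet numbers; the hourly rates are hypothetical placeholders, not OCI list prices, so substitute current rates from the OCI pricing page before drawing conclusions.

```python
# Illustrative price-performance: Offline RetinaNet samples/sec per $/hr.
# HOURLY_USD values are hypothetical placeholders, NOT OCI list prices.
HOURLY_USD = {"BM.GPU.H100.8": 100.0, "BM.GPU.A10.4": 8.0}   # placeholders
OFFLINE_RETINANET = {"BM.GPU.H100.8": 14_047.20, "BM.GPU.A10.4": 953.53}

for shape, rate in HOURLY_USD.items():
    value = OFFLINE_RETINANET[shape] / rate
    print(f"{shape}: {value:,.1f} samples/sec per $/hr")
```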
Oracle Cloud Infrastructure provides a comprehensive portfolio of GPU options optimized for AI workloads, including training and inference. The MLPerf Inference results showcase OCI's competitive strength in AI infrastructure and its ability to handle demanding workloads like large language models. For more information on our products, see our GPU and AI infrastructure pages.
Sesh is a Distinguished Cloud Architect. His passion is leveraging technology to enable business outcomes, especially in the areas of MLOps and cloud-native solutions. He has an MBA from the University of Minnesota and a bachelor's degree in engineering.
Akshai is a Director of Product Marketing for Oracle Cloud Infrastructure (OCI) focused on driving adoption of OCI’s services and solutions. He is a graduate of UC Berkeley and Georgia Tech. When not working, he enjoys keeping up with the latest in technology and business.
Ruzhu is a Master Principal Cloud Architect on the OCI AI/ML cloud engineering team with strong hands-on expertise in large AI/ML platform and application optimization. He has more than 20 years of experience in life science application development, enablement, and user support, previously as an IBM Lead Scientist and subject matter expert on the life science global team. He holds a PhD in microbiology (with a molecular biology focus) and a master's degree in computer science.