Executive Summary

Oracle Cloud Infrastructure (OCI) File Storage with Lustre offers a compelling solution for IT decision makers who need high-performance file storage for critical AI, machine learning (ML), and high-performance computing (HPC) workloads. This blog showcases the performance capabilities of File Storage with Lustre through rigorous synthetic benchmarks, while also highlighting the unique advantages it offers to customers. By running industry-standard benchmarking tools, we demonstrate File Storage with Lustre’s ability to deliver consistent high throughput, scalable IOPS, and strong metadata performance. These attributes are critical for accelerating AI and ML projects, enabling faster training cycles, and supporting large-scale data analytics. This post is part of a two-part series; the detailed follow-up guide, Running Benchmarks on OCI File Storage with Lustre, provides step-by-step instructions and results for replicating these benchmarks.

 

Why File Storage with Lustre? A New Standard for Cloud File Storage

Modern IT organizations are under constant pressure to deliver scalable, performant, and reliable infrastructure to fuel data-driven innovation. As workloads become more data-intensive, spanning AI model training and compute-intensive simulations, the demands placed on underlying storage platforms have never been higher. Traditional storage architectures frequently struggle with performance at scale, inflexible growth, and the complexity of managing distributed systems across ever-expanding environments.

Recognizing these challenges, Oracle has introduced OCI File Storage with Lustre, a fully managed file storage service purpose-built to meet the rigorous needs of today’s most demanding workloads. By removing the complexities of deployment, scaling, and day-to-day management, File Storage with Lustre enables customers to focus on delivering business value, not managing infrastructure.

File Storage with Lustre implements open-source Lustre in the cloud with the full resilience and availability of OCI. It stands apart through several foundational strengths:

  • Elastic Scalability: Currently, File Storage with Lustre can grow online from tens of terabytes up to 20PB of capacity. You can start small at 31TB and expand the filesystem as needed, meeting data growth without re-architecting storage or disrupting existing workflows.
  • Proven Performance at Scale: Linear performance scaling is demonstrated through extensive benchmarks. As storage capacity increases, both throughput and IOPS scale linearly, ensuring fast access as workloads and datasets grow.
  • Simplified Operations: By leveraging OCI’s managed infrastructure, File Storage with Lustre eliminates the need for ongoing hardware tuning, manual failover configuration, or downtime during scaling events. The managed service also takes care of security patching, Lustre bug fixes, and upgrades, and it integrates with the native OCI Monitoring service to make monitoring easy.
  • High Availability and Durability by Design: Integrated fault domain architecture, storage volume replication, automatic failover, and no single point of failure maximize uptime for mission-critical workloads.

 

Performance characteristics

File Storage with Lustre offers various performance tiers to meet your specific requirements. Customers are currently using file systems in production with various sizes and performance tiers, including file systems as large as 20PB.

 

Performance tier                          Aggregate performance for a 1PB file system

125 MBps (1 Gbps) per provisioned TB      128GB/s
250 MBps (2 Gbps) per provisioned TB      256GB/s
500 MBps (4 Gbps) per provisioned TB      512GB/s
1000 MBps (8 Gbps) per provisioned TB     1TB/s
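
The aggregate figures above follow from multiplying the per-TB rate by the provisioned capacity. The following is a minimal Python sketch of that arithmetic. It assumes that 1PB is counted as 1024TB and that 1GB/s equals 1000 MBps, which is the combination that reproduces the 128GB/s figure. The function and constant names are illustrative and are not part of any OCI API.

```python
# Illustrative arithmetic only; names here are hypothetical, not an OCI API.
TIERS_MBPS_PER_TB = (125, 250, 500, 1000)  # performance tiers listed above

TB_PER_PB = 1024  # assumption: binary PB, which matches the 128GB/s figure


def aggregate_gb_per_s(tier_mbps_per_tb: int, capacity_tb: int) -> float:
    """Return nominal aggregate throughput in GB/s (1 GB/s = 1000 MBps)."""
    return tier_mbps_per_tb * capacity_tb / 1000


if __name__ == "__main__":
    for tier in TIERS_MBPS_PER_TB:
        gbps = aggregate_gb_per_s(tier, TB_PER_PB)
        print(f"{tier} MBps/TB at 1PB: {gbps:g} GB/s")
```

The same function gives the nominal throughput of smaller file systems, for example aggregate_gb_per_s(125, 31) for a 31TB file system on the lowest tier.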

 

Benchmark Results: Transparent, Industry-Standard Performance

To give customers confidence in the real-world capabilities of File Storage with Lustre, Oracle subjected it to rigorous testing using industry-standard tools:

  1. IOR Benchmark: Assesses parallel file I/O throughput typical of HPC workloads. File Storage with Lustre demonstrated linear throughput scaling from 125TB (16 clients) to 250TB (32 clients), with performance improving predictably as capacity and client count increased.
  2. FIO Benchmark: Simulates block-level I/O with mixed random reads/writes (in this case, a 60/40 ratio). Tests at varying block sizes (1M, 128k, 4k) confirmed that IOPS, too, grow proportionally with provisioned capacity.
  3. MDTest: Focuses on metadata performance, which is critical for workloads with extensive file creation, deletion, and stat operations. Once again, File Storage with Lustre showed scalable metadata performance, proportional to both capacity and the number of metadata targets (MDTs).
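
To make the FIO test shape concrete, here is a hedged Python sketch that assembles fio command lines for a mixed random read/write job at the three block sizes named above. It is not the command set Oracle ran; the follow-up guide covers the actual steps. The mount point, queue depth, job count, file size, and runtime are assumed placeholders, and reading the 60/40 ratio as 60% reads is also an assumption.

```python
import shlex

# Block sizes named in the post; everything else below is an assumed placeholder.
BLOCK_SIZES = ("1M", "128k", "4k")


def fio_command(block_size: str, directory: str = "/mnt/lustre/fio") -> list[str]:
    """Build an fio argument list for a mixed random read/write job."""
    return [
        "fio",
        f"--name=randrw-{block_size}",
        f"--directory={directory}",  # hypothetical Lustre mount point
        "--rw=randrw",               # mixed random reads and writes
        "--rwmixread=60",            # assumption: 60% reads, 40% writes
        f"--bs={block_size}",
        "--direct=1",                # bypass the client page cache
        "--ioengine=libaio",
        "--iodepth=32",              # assumed queue depth
        "--numjobs=8",               # assumed jobs per client
        "--size=4G",                 # assumed per-job file size
        "--runtime=120",             # assumed runtime in seconds
        "--time_based",
        "--group_reporting",
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for bs in BLOCK_SIZES:
        print(shlex.join(fio_command(bs)))
```

Printing rather than executing keeps the sketch safe to run anywhere; on a client with the file system mounted, each printed line could be launched per client to compare IOPS across block sizes.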

Figure 1 – Throughput performance (throughput scaling)

Figure 2 – Metadata performance (metadata scaling)

The benchmarks for File Storage with Lustre demonstrate that it delivers linear and predictable scalability in both throughput and metadata as additional storage capacity is provisioned. Using standard industry tools (IOR, FIO, and MDTest), we tested parallel file I/O, mixed random read/write operations, and metadata performance. The results show that as file system size and the number of clients increase, performance scales proportionally across data transfer speed, IOPS, and metadata operations. Importantly, these results were achieved with out-of-the-box deployments, highlighting the service's robust performance and ease of use for demanding workloads like AI, ML, and HPC.
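
One way to quantify "linear" scaling is to divide the performance ratio by the capacity ratio, so that 1.0 means perfectly linear. The Python sketch below shows that calculation. The inputs in the example are nominal values derived from the 125 MBps-per-TB tier, not measured results, and the choice of tier is an assumption because the post does not state which tier was benchmarked.

```python
def scaling_efficiency(base_capacity_tb, base_perf, scaled_capacity_tb, scaled_perf):
    """Ratio of performance growth to capacity growth; 1.0 means perfectly linear."""
    capacity_ratio = scaled_capacity_tb / base_capacity_tb
    perf_ratio = scaled_perf / base_perf
    return perf_ratio / capacity_ratio


if __name__ == "__main__":
    # Nominal (not measured) throughput at an assumed 125 MBps-per-TB tier,
    # for the two capacities used in the IOR test.
    nominal_125tb = 125 * 125  # MBps
    nominal_250tb = 125 * 250  # MBps
    print(scaling_efficiency(125, nominal_125tb, 250, nominal_250tb))
```

Substituting measured IOR, FIO, or MDTest results for the nominal values would show how closely a given deployment tracks the linear ideal.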

 

Get Started with OCI File Storage with Lustre

You can create a file system from the OCI Console, Terraform, the CLI, or the APIs. To create a Lustre file system today, navigate to Lustre File Storage in the Oracle Cloud Console. For more detailed technical information, consult the File Storage with Lustre documentation or reach out to OCI to discuss your file system and aggregate performance requirements.

For more information, see the following resources: