Announcing the general availability of OCI File Storage high-performance mount target for AI and ML training

September 10, 2024 | 4 minute read
Prashant Jagannathan
Principal Product Manager
Sabrinath Rao
Sr. Director, Product Management, OCI Storage

We’re excited to announce the general availability of Oracle Cloud Infrastructure (OCI) File Storage high-performance mount target (HPMT), which enables you to scale your throughput for each mount target up to 80 Gbps. You can add multiple mount targets to scale the aggregate throughput of a single file system. HPMT is engineered to deliver exceptional speed, making it ideal to use with GPUs for AI and machine learning (ML) training and checkpointing. 

In production, HPMTs deliver up to 480 Gbps of sustained aggregate read throughput for a well-known large language model (LLM) customer, loading training data across thousands of GPU nodes. HPMT also enables a wide range of other high-performance workloads, such as media and entertainment development, application development, and user shares. OCI itself uses HPMTs to reach up to 500 Gbps of aggregate throughput, accelerating the build pipelines for Oracle databases and applications.

The OCI File Storage service is a distributed file system that can scale the capacity of a single file system to exabytes. HPMT now enables you to scale file system throughput to terabits per second.

We’re introducing a new capacity-based pricing model, calculated in GB per month. For each 1 Gbps of read throughput, you get 1 TB of file storage. We’re enabling the following throughput-capacity pairs:

  • HPMT-20: up to 20 Gbps throughput includes 20 TB of file storage capacity
  • HPMT-40: up to 40 Gbps throughput includes 40 TB of file storage capacity
  • HPMT-80: up to 80 Gbps throughput includes 80 TB of file storage capacity

You’re billed for the capacity included in the throughput option that you choose. For example, if you use an HPMT-20 mount target, you’re billed for 20 TB of capacity. You can find pricing details in OCI File Storage Pricing.
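
The throughput-to-capacity mapping above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in this post: the function and constant names are hypothetical, and the sketch deliberately leaves out the price per GB, which is listed in OCI File Storage Pricing.

```python
# Throughput tiers from this post: tier name -> maximum Gbps per mount target.
HPMT_TIERS_GBPS = {"HPMT-20": 20, "HPMT-40": 40, "HPMT-80": 80}

# The post states that each 1 Gbps of read throughput includes 1 TB of capacity.
TB_PER_GBPS = 1


def billed_capacity_tb(tier: str) -> int:
    """Return the capacity, in TB, billed for one mount target of this tier."""
    return HPMT_TIERS_GBPS[tier] * TB_PER_GBPS


def total_billed_capacity_tb(tiers: list[str]) -> int:
    """Sum the billed capacity, in TB, across several mount targets."""
    return sum(billed_capacity_tb(tier) for tier in tiers)


print(billed_capacity_tb("HPMT-20"))                      # 20
print(total_billed_capacity_tb(["HPMT-80", "HPMT-80"]))   # 160
```

Under these assumptions, two HPMT-80 mount targets are billed as 160 TB of capacity.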

Use cases

HPMTs aid in the following use cases:

  • AI and ML training and checkpointing: AI and machine learning models require terabytes of data from a single file system to be loaded onto thousands of GPUs in parallel. GPUs must be fed data consistently for the entire duration of training to avoid idle time. Accessing data at up to 80 Gbps per mount target lets you use the full potential of your GPUs and lets multiple GPUs access the same training data set. As a result, you get shorter training times and quicker iterations, accelerating your path to AI breakthroughs while managing costs by eliminating GPU idle time.
  • Development shares: Customers often use file shares as a code repository for application development. HPMT enables build tools to quickly scan and process many files in parallel. 
  • Media streaming: Working with high-resolution video files and complex animations requires fast and reliable storage. HPMTs provide the speed and scalability needed to handle large files, helping ensure smooth playback, rendering, and editing. As a result, creative professionals can work more efficiently and bring their visions to life.
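
To put the per-mount-target figures from the AI and ML training use case into perspective, the following rough sketch estimates how long a data set takes to read at a given sustained throughput. It assumes decimal units (1 TB = 8,000 gigabits) and treats the quoted "up to" rates as if they were sustained for the whole transfer, which real workloads might not achieve. The function name is hypothetical.

```python
GIGABITS_PER_TB = 8 * 1000  # assumption: decimal units, 1 TB = 1,000 GB = 8,000 Gb


def transfer_time_seconds(data_tb: float, throughput_gbps: float) -> float:
    """Idealized time to read data_tb terabytes at a sustained throughput_gbps."""
    if throughput_gbps <= 0:
        raise ValueError("throughput_gbps must be positive")
    return data_tb * GIGABITS_PER_TB / throughput_gbps


# Reading a 10 TB training set through one mount target at each tier's ceiling.
for gbps in (20, 40, 80):
    print(gbps, "Gbps:", transfer_time_seconds(10, gbps), "seconds")
```

In this idealized model, a 10 TB data set takes 4,000 seconds at 20 Gbps and 1,000 seconds at 80 Gbps.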

How HPMTs work

HPMTs are backed by OCI File Storage, which has a distributed file system architecture that can scale both up and out. Each individual mount target can achieve speeds of up to 80 Gbps, providing the bandwidth necessary for high-performance workloads. You can also scale out the throughput of your file system by adding more HPMTs. For example, a customer uses eight HPMTs to enable up to 640 Gbps of aggregate read throughput for AI training, as shown in Figure 1.

Figure 1: Architecture diagram for deploying high-performance mount targets
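
The scale-out arithmetic in this section can be sketched as follows. The 80 Gbps per-mount-target ceiling comes from this post. The function names are hypothetical, and the results are upper bounds, because the throughput you actually get depends on your workload and clients.

```python
import math

MAX_GBPS_PER_MOUNT_TARGET = 80  # per-mount-target ceiling stated in this post


def aggregate_throughput_gbps(mount_targets: int,
                              per_target_gbps: int = MAX_GBPS_PER_MOUNT_TARGET) -> int:
    """Upper bound on aggregate throughput for a number of mount targets."""
    return mount_targets * per_target_gbps


def mount_targets_needed(target_gbps: float,
                         per_target_gbps: int = MAX_GBPS_PER_MOUNT_TARGET) -> int:
    """Smallest number of mount targets whose combined ceiling meets target_gbps."""
    return math.ceil(target_gbps / per_target_gbps)


print(aggregate_throughput_gbps(8))  # 640, the eight-HPMT example above
print(mount_targets_needed(480))     # 6
print(mount_targets_needed(500))     # 7
```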

To upgrade an existing mount target, navigate to the File Storage section and select the mount target that you want to upgrade. Configure the mount target’s speed to enable read throughput of up to 20 Gbps, 40 Gbps, or 80 Gbps, as shown in Figure 2.

Figure 2: Configuring a mount target in the console 

Conclusion

The new HPMT feature of OCI File Storage is a game-changer for organizations that demand speed, reliability, and scalability in their storage solutions. Whether you’re training AI models, running complex simulations, or creating stunning media content, HPMTs are designed to meet your needs and exceed your expectations.

We want you to experience these new features and all the enterprise-grade capabilities that Oracle Cloud Infrastructure offers. Interested in trying File Storage? Sign up for a free trial or reach out to OCI sales.

We value your feedback as we continue to make our service the best in the industry. Contact us to share your thoughts on how we can continue to improve or if you want more details about any topic. More feature updates are on the horizon for our cloud storage platform.

Prashant Jagannathan

Principal Product Manager

Prashant is an experienced product management leader with an extensive background in storage, data management, and data protection, where he has managed the entire product lifecycle from ideation to execution. He currently manages the OCI File Storage service.

Sabrinath Rao

Sr. Director, Product Management, OCI Storage

Sabrinath partners with customers and various OCI teams to build service capabilities that help customers succeed in their cloud journey. 

