
Blazing Fast Data Uploads to Bare Metal Cloud Object Storage!

Wouldn’t it be great if you could automagically boost the speed at which you upload data to Bare Metal Cloud Object Storage by an order of magnitude?

What about being able to better manage slow and flaky networks when uploading data? If that mirrors your struggles with data uploads, then this blog post is just for you!

I am excited to let you know that we are launching a new BMC Object Storage capability – Multipart upload. Multipart upload not only enables blazing fast uploads to our object storage, it also enables efficient management of slow or flaky networks by providing frustration-free control over the data upload process. As if that were not a sweet deal by itself, you can now also store very large objects, up to 10 tebibytes (TiB) in size, which is 2-10 times the maximum object size supported by other hyperscale public cloud providers.

So what is multipart upload? In the most basic terms, the multipart upload functionality lets you break objects into smaller parts. You can then upload these parts in parallel. Once all the parts are uploaded, they are coalesced server-side into one contiguous object that looks and behaves exactly as it would if it had been uploaded all at once in a single operation. It’s this massive parallelism that significantly boosts the speed at which objects can be uploaded.
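To make the "break objects into smaller parts" idea concrete, here is a minimal Python sketch that computes the byte range of each part. The function name `split_into_parts` and the 100 MiB part size are illustrative assumptions of mine, not part of the Object Storage API.

```python
MIB = 1024 * 1024


def split_into_parts(total_size, part_size):
    """Return (part_number, offset, length) tuples that cover total_size bytes.

    Hypothetical helper for illustration only; part numbers start at 1.
    """
    if total_size <= 0 or part_size <= 0:
        raise ValueError("sizes must be positive")
    parts = []
    offset = 0
    part_number = 1
    while offset < total_size:
        # The final part may be shorter than part_size.
        length = min(part_size, total_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts


# A 250 MiB object split into 100 MiB parts yields two full parts and one 50 MiB part.
parts = split_into_parts(250 * MIB, 100 * MIB)
for number, offset, length in parts:
    print(f"part {number}: offset={offset} length={length}")
```

Each tuple describes an independent slice of the object, which is what allows the parts to be sent concurrently.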

If you have great internet connectivity, multipart uploads let you saturate your network, which was previously not possible. Conversely, if your network is slow or flaky, multipart upload helps there too: when you break an object into its constituent parts, each part is treated as an independent entity. If a part fails to upload because of a network issue, instead of having to start the upload again from scratch, you can just retry uploading the part that failed. Furthermore, if you want to upload a large object over a slow network, you can pause and resume the upload process to take advantage of off-peak times when more bandwidth is available. An in-process upload stays active indefinitely, until it is explicitly aborted.
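The "retry only the part that failed" behavior can be sketched in a few lines of Python. Everything here is a stand-in: `upload_with_retries` and `flaky_upload_part` are hypothetical names, the network failure is simulated, and the MD5 hex digest is only an assumed placeholder for whatever ETag the service returns.

```python
import hashlib


def upload_with_retries(parts, upload_part, max_attempts=3):
    """Upload each part, retrying a failed part without touching the others.

    parts maps part_number -> bytes; upload_part is any callable that
    returns an ETag-like string or raises ConnectionError.
    """
    etags = {}
    for part_number, data in parts.items():
        for attempt in range(1, max_attempts + 1):
            try:
                etags[part_number] = upload_part(part_number, data)
                break  # this part succeeded, move on to the next one
            except ConnectionError:
                if attempt == max_attempts:
                    raise  # give up on this part after the last attempt
    return etags


attempts = {}


def flaky_upload_part(part_number, data):
    """Simulated uploader (assumption): part 2 fails on its first attempt."""
    attempts[part_number] = attempts.get(part_number, 0) + 1
    if part_number == 2 and attempts[part_number] == 1:
        raise ConnectionError("simulated network drop")
    return hashlib.md5(data).hexdigest()


etags = upload_with_retries({1: b"aaa", 2: b"bbb", 3: b"ccc"}, flaky_upload_part)
print(attempts)
```

In this simulation only part 2 is sent twice; parts 1 and 3 are never re-uploaded.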

Here is a visual depiction of how multipart upload works:

[Figure: Multipart Upload.jpg]

We recommend that you use the multipart upload functionality to upload objects that are 100 MiB or larger. The minimum supported size for an individual object part is 10 MiB. If you would like to upload an object smaller than 10 MiB, the normal object upload process using the PUT call will work well.
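The limits mentioned in this post (100 MiB recommendation, 10 MiB minimum part size, 10,000 parts, 10 TiB maximum object size) can be combined into a small planning helper. This is only a sketch under my own assumptions: `plan_upload` is a hypothetical function, the 128 MiB preferred part size is arbitrary, and treating the 100 MiB recommendation as a hard switch-over point is a simplification.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4

MULTIPART_THRESHOLD = 100 * MIB  # recommended switch-over point from this post
MIN_PART_SIZE = 10 * MIB         # minimum supported part size
MAX_PARTS = 10_000               # maximum parts per object
MAX_OBJECT_SIZE = 10 * TIB       # maximum object size


def plan_upload(object_size, preferred_part_size=128 * MIB):
    """Return (method, part_count, part_size) for an object of object_size bytes."""
    if not 0 < object_size <= MAX_OBJECT_SIZE:
        raise ValueError("object size must be between 1 byte and 10 TiB")
    if object_size < MULTIPART_THRESHOLD:
        return ("put", 1, object_size)
    # Smallest part size that keeps the part count within MAX_PARTS
    # (integer ceiling division avoids floating-point rounding).
    size_for_max_parts = -(-object_size // MAX_PARTS)
    part_size = max(preferred_part_size, MIN_PART_SIZE, size_for_max_parts)
    part_count = -(-object_size // part_size)
    return ("multipart", part_count, part_size)


small = plan_upload(50 * MIB)
medium = plan_upload(1024 * MIB)
largest = plan_upload(MAX_OBJECT_SIZE)
print(small, medium, largest)
```

For the largest supported object, the part size has to grow to roughly 1 GiB so that the upload still fits within 10,000 parts.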

Uploads to a single contiguous object are tied together by an upload ID. This is how the process typically works:

  1. Break the object into multiple parts. We support up to 10,000 parts per object.
  2. Initiate a multipart upload. Supply the object name and any customer-defined metadata. The Object Storage service returns an upload ID.
  3. Upload the parts in parallel. For each part, specify a part number and the upload ID returned in step 2. The part numbers that you supply need not be contiguous; any number from 1 to 10,000 works. For each uploaded part, the system returns an ETag and an MD5 hash, which you need to store safely. You’ll need the ETags when you commit the parts into a single object.
  4. Once all parts are uploaded, commit the upload. Specify the upload ID, the part numbers, and the associated ETags. The parts are coalesced in ascending order of part number. Any parts that you previously uploaded but no longer wish to commit can be excluded by specifying their part numbers in the exclusion list. These excluded parts are deleted automatically at the end of the commit process.
  5. All done! The object upload using multipart upload is complete.
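The steps above can be walked through end to end with a small in-memory simulation. To be clear, `InMemoryMultipartStore` and its methods are hypothetical stand-ins written for this sketch, not the native API or the Java SDK. Using an MD5 hex digest as the ETag and requiring every uploaded part to be either committed or excluded are assumptions as well.

```python
import hashlib
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor


class InMemoryMultipartStore:
    """Hypothetical stand-in for the service, for illustration only."""

    def __init__(self):
        self._uploads = {}  # upload_id -> {"name", "metadata", "parts"}
        self.objects = {}   # object name -> committed bytes
        self._lock = threading.Lock()

    def create_upload(self, name, metadata=None):
        upload_id = uuid.uuid4().hex
        with self._lock:
            self._uploads[upload_id] = {
                "name": name, "metadata": metadata or {}, "parts": {}}
        return upload_id

    def upload_part(self, upload_id, part_number, data):
        if not 1 <= part_number <= 10_000:
            raise ValueError("part number must be between 1 and 10,000")
        with self._lock:
            self._uploads[upload_id]["parts"][part_number] = data
        return hashlib.md5(data).hexdigest()  # assumed ETag format

    def commit(self, upload_id, parts_to_commit, parts_to_exclude=()):
        with self._lock:
            upload = self._uploads[upload_id]
            stored = upload["parts"]
            # Sketch assumption: every part is either committed or excluded.
            if set(stored) != set(parts_to_commit) | set(parts_to_exclude):
                raise ValueError("unaccounted parts in upload")
            for number, etag in parts_to_commit.items():
                if hashlib.md5(stored[number]).hexdigest() != etag:
                    raise ValueError(f"ETag mismatch for part {number}")
            # Coalesce in ascending part-number order; excluded parts are dropped.
            self.objects[upload["name"]] = b"".join(
                stored[n] for n in sorted(parts_to_commit))
            del self._uploads[upload_id]


store = InMemoryMultipartStore()
payload = b"hello multipart world"
# Part numbers need not be contiguous.
chunks = {10: payload[:6], 20: payload[6:16], 30: payload[16:]}

upload_id = store.create_upload("greeting.txt", {"owner": "demo"})
with ThreadPoolExecutor(max_workers=3) as pool:
    futures = {n: pool.submit(store.upload_part, upload_id, n, d)
               for n, d in chunks.items()}
    etags = {n: f.result() for n, f in futures.items()}

store.upload_part(upload_id, 99, b"oops")  # a part we decide not to keep
store.commit(upload_id, etags, parts_to_exclude=[99])
print(store.objects["greeting.txt"])
```

The committed object is byte-for-byte identical to the original payload, and the unwanted part 99 never makes it into the final object.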

If you are uploading multiple objects at once, you can keep track of the in-process uploads by listing them. You can also list the parts associated with a specific upload by supplying the upload ID as a parameter. Any active upload can be deleted by explicitly invoking the ‘Abort’ call. Please note that the listing and aborting functionality only works until the object has been committed. Once the object has been committed, the normal object considerations apply and the object behaves as if a single PUT call had been made to upload it.
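The lifecycle described here (list, list parts, abort, and nothing left to list after commit) can be modeled with a tiny tracker. `UploadTracker` is a hypothetical class of my own for illustration; it does not call the service and its method names are not the real API.

```python
class UploadTracker:
    """Hypothetical model of in-process uploads (not a real SDK class)."""

    def __init__(self):
        self._active = {}  # upload_id -> {"object": name, "parts": [numbers]}

    def start(self, upload_id, object_name):
        self._active[upload_id] = {"object": object_name, "parts": []}

    def record_part(self, upload_id, part_number):
        self._active[upload_id]["parts"].append(part_number)

    def list_uploads(self):
        return {uid: info["object"] for uid, info in self._active.items()}

    def list_parts(self, upload_id):
        # Raises KeyError once the upload has been committed or aborted.
        return sorted(self._active[upload_id]["parts"])

    def abort(self, upload_id):
        del self._active[upload_id]

    def commit(self, upload_id):
        # After commit the upload is no longer in process, so it is removed.
        return self._active.pop(upload_id)["object"]


tracker = UploadTracker()
tracker.start("upload-a", "backup.tar")
tracker.start("upload-b", "video.mp4")
tracker.record_part("upload-a", 2)
tracker.record_part("upload-a", 1)

parts_a = tracker.list_parts("upload-a")
tracker.abort("upload-b")   # explicitly abort one upload
tracker.commit("upload-a")  # commit the other
remaining = tracker.list_uploads()
print(parts_a, remaining)
```

Once an upload is committed or aborted, asking the tracker for its parts raises `KeyError`, mirroring the note that listing and aborting only apply to uploads that are still in process.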

Multipart upload is supported via the Object Storage native API and the Java SDK. Support for the Python SDK and the CLI will arrive shortly. If you would like to learn more about the multipart upload functionality, you can refer to the Multipart upload FAQs, the Object Storage User Guide, or the API reference guide.

I hope you are enjoying hearing about these Object Storage updates as much as I enjoy writing about them. Keep watching this space for future updates!

As always, I would love to hear from you. Please reach out to me by commenting below or via your Oracle account manager!

Update: The Object Storage CLI now supports Multipart Upload. You can now download the CLI from here
