More on Blocks

A few weeks ago I was blogging about how block protocols like SCSI were designed around the on-disk sector format and limited intelligence of 1980s disk drives. Clearly, if we were starting from a clean sheet of paper today to design a storage interconnect for modern intelligent storage devices, this is NOT the protocol we would create.

The real problem, though, isn't just that the physical sector size doesn't apply to today's disk arrays. The bigger problem is the separation of storage from the application/compute server. Storage in today's data centers sits in storage servers, typically disk arrays or tape libraries, which are available as services on a network, the SAN, and are used by some number of clients, the application/compute servers. As with any server, you would like some guarantee of the level of service it provides. This includes availability of the data, response time, security, failure and disaster tolerance, and a variety of other service levels needed to ensure compliance with data-retention laws and to avoid over-provisioning.

The block protocol was not designed with any notion of service levels. When a data client writes a collection of data, there is no way to tell the storage server what storage service level that particular data requires. Furthermore, all data gets broken into 512-byte blocks, so there isn't even a way to identify which blocks belong together and require a common service level. The workaround today is to use a management interface to apply service levels at the LUN level, which is far too coarse and leads to over-provisioning. This gets really complicated when you factor in Information Lifecycle Management (ILM), where data migrates and gets replicated to different classes of storage. The result is highly complex management software and administrative processes that must tie together management APIs from a variety of storage servers, operating systems, and database and backup applications.
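To make the gap concrete, here is a minimal Python sketch of a SCSI WRITE(10) command descriptor block. Every field in it says where the data goes and how much there is; there is simply no field in which the initiator could express a required service level.

```python
import struct

def write10_cdb(lba: int, num_blocks: int) -> bytes:
    """Build a SCSI WRITE(10) command descriptor block (CDB).

    Every field describes *where* and *how much*: opcode, logical
    block address, transfer length. Nothing here can carry a
    service-level requirement for the data being written.
    """
    return struct.pack(
        ">BBIBHB",
        0x2A,        # opcode: WRITE(10)
        0x00,        # flags byte (WRPROTECT / DPO / FUA) -- still no QoS
        lba,         # 32-bit logical block address, big-endian
        0x00,        # group number
        num_blocks,  # 16-bit transfer length, in blocks
        0x00,        # control byte
    )

cdb = write10_cdb(lba=0x1000, num_blocks=8)
assert len(cdb) == 10  # a WRITE(10) CDB is exactly ten bytes
```

The ten bytes are fully spoken for by addressing and transfer-length fields, which is exactly why service levels have to be bolted on out-of-band at the LUN level.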

If we were starting from a clean sheet of paper today to design a storage interconnect, we would do a couple of things. First, we would use the concept of a variable-sized data object that allows the data client to group related data at a much finer granularity than the LUN. This could be an individual file, a database record, or any unit of data that requires a consistent storage service level and that migrates through the information lifecycle as a unit. Second, each data object would include metadata: information about the object that identifies the service levels, access rights, and so on required for that piece of data. This metadata stays with the data object as it migrates through its lifecycle and gets accessed by multiple data clients.
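A hedged sketch of what such a self-describing object might look like. The field names here are purely illustrative, invented for this example rather than taken from any standard:

```python
from dataclasses import dataclass, field

@dataclass
class DataObject:
    """Hypothetical data object: a variable-sized payload plus the
    metadata that travels with it through its lifecycle."""
    object_id: int
    data: bytes                       # variable-sized payload: a file, a record, ...
    metadata: dict = field(default_factory=dict)

obj = DataObject(
    object_id=42,
    data=b"...customer record...",
    metadata={
        "retention_years": 7,                     # data-retention compliance
        "availability": "99.999%",                # required service level
        "access_rights": ["finance-rw", "audit-ro"],
    },
)
```

Because the metadata rides with the object, a storage server that receives it later in the lifecycle can honor the same retention and access requirements without consulting an external management database.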

Of course there are some things about today's block protocols we would retain such as the separation of command and data. This allows block storage devices and HBAs to quickly communicate the necessary command information to set up DMA engines and memory buffers to subsequently move data very efficiently.
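A toy model of that command/data separation, with hypothetical class and method names: the small command message arrives first, so the target can size its buffer (or program a DMA engine) before the bulk data phase begins.

```python
class Target:
    """Toy storage target illustrating command/data separation."""

    def __init__(self):
        self.buffers = {}

    def command_phase(self, tag: int, length: int) -> None:
        # Small control message: enough to pre-allocate the buffer
        # (in real hardware, to program the DMA engine) before any
        # data arrives.
        self.buffers[tag] = bytearray(length)

    def data_phase(self, tag: int, payload: bytes) -> None:
        # Bulk transfer lands directly in the pre-allocated buffer;
        # no per-byte parsing on the data path.
        self.buffers[tag][: len(payload)] = payload

t = Target()
t.command_phase(tag=1, length=4096)
t.data_phase(tag=1, payload=b"\xab" * 4096)
```

The efficiency comes from the ordering: by the time data flows, the receiving side already knows exactly where to put it.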

Key players in the storage industry have created just such a protocol in the ANSI standards group that governs the SCSI protocol. The new protocol is called OSD, for Object-based Storage Devices. OSD is based on variable-sized data objects that include metadata, and it can run on all the same physical interconnects as SCSI, including parallel SCSI, Fibre Channel, and Ethernet. With the OSD protocol, we now have the potential to let data clients specify service levels in the metadata of each data object and to design storage servers to support those service-level agreements.
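A simplified sketch of the key shift in addressing. A real OSD command carries considerably more than this, but where a block command names a logical block address, an OSD command names a 64-bit partition ID and a 64-bit object ID, roughly:

```python
import struct

def osd_object_address(partition_id: int, object_id: int) -> bytes:
    """Pack an OSD-style object address: a 64-bit partition ID plus a
    64-bit object ID, big-endian. Simplified from the T10 OSD model,
    which also carries attribute (metadata) parameters in the command."""
    return struct.pack(">QQ", partition_id, object_id)

addr = osd_object_address(partition_id=0x10000, object_id=0xABCD)
assert len(addr) == 16  # two 64-bit identifiers
```

The important point is what the address no longer is: a raw block number. The device, not the client, decides how the object's bytes map onto media, which is what frees it to honor per-object service levels.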

I could go on for many pages about the service levels that can be specified for data objects. They cover performance, availability, security (including access rights and access logs), compliance with data-retention laws, and any other storage SLAs an administrator may have. I'll talk more about these in future blogs.


Ken, I totally agree that the data access interface has to be re-architected to allow more contextual information handling toward SLAs/SLOs. OSD and other object storage projects (e.g. SNIA's XAM) are first steps in the right direction. However, there's a general concern about the proliferation of uncoordinated metadata sets across multiple IT layers. Do you see a need for standardizing access to such contextual management information so that management frameworks can provide end-to-end management schemes? Many thanks, Vincent.

Posted by Vincent Franceschini on April 23, 2006 at 08:55 PM PDT #

Hi Ken, this is a very sensible vision of how the protocol and standards designers should have approached things. Your blog is info-rich and I'd like to see you writing more. Yours is the top blog on Google when we search for SAN info.

Posted by Powell on October 04, 2006 at 11:37 PM PDT #


