Monday Aug 21, 2006

Storage Grid 2: Cleversafe (!?)

BitTorrent was a great read the other week, and I am still having the occasional fun with it (disclaimer: I have not downloaded Pirates of the Caribbean 2 despite what all of those trackers say). Cleversafe is just as clever and answers the opposite problem with a similar solution.

[Read More]

Saturday Jul 15, 2006

Standards vs. Proprietary APIs, take the standard for the long haul

We're currently nearing Alpha / Beta for the StorageTek Management Portal 2.0 (previously the StorEdge Management Portal 1.0). The jump in the version number from 1.0 to 2.0 is necessitated by a jump in adoption of the SMI-S 1.1 standard for collecting information from the storage ecosystem as opposed to our previous "agent" technology that was a mash of SNMP and proprietary agent libraries. SMI-S is, basically, a blend of the three important pieces that make up a Service Oriented Architecture (granted, SMI-S is not currently a Web Services model, but that is in the works with WS-Management). SMI-S provides:
  • Lookup of agents via Service Location Protocol (SLP)
  • Communication with agents via Web-based Enterprise Management (WBEM) over HTTP(s)
  • Navigation of software/hardware elements via a standard model
Jumping to standards provides a variety of benefits for our team:
  • We have been able to rapidly inspect and gather content from devices we have not encountered before
  • We manipulate all agents with about 90% of the same code (maybe 10% unique code is necessary due to vendor anomalies in compliance with SMI-S)
  • We do not have to create extension points for new protocols (WBEM over HTTPS vs. SNMP vs. RPCs vs. RSH vs. etc...) to communicate with new devices
  • We are given a standard data model for our information and we don't have to spend much time creating an internal data model that is an abstraction of all of the proprietary APIs.
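To make the 90/10 split above a little more concrete, here is a rough sketch in Python. It is not the portal's code. The `Collector` classes, the `fetch` callable (a stand-in for a WBEM "enumerate instances" call), the "Acme" vendor and its quirk, and the canned agent data are all made up for illustration. Only the class name `CIM_StorageVolume` and its `ElementName`, `DeviceID`, `BlockSize` and `NumberOfBlocks` properties come from the standard model.

```python
from typing import Callable, Dict, List

Instance = Dict[str, object]
# Hypothetical stand-in for a WBEM "enumerate instances" call against one agent.
Fetch = Callable[[str], List[Instance]]


class Collector:
    """Shared code path: works for any agent that follows the standard model."""

    def __init__(self, fetch: Fetch):
        self.fetch = fetch

    def normalize(self, inst: Instance) -> Instance:
        # Hook for vendor anomalies; a compliant agent needs no changes.
        return inst

    def volume_capacities(self) -> Dict[str, int]:
        """Return capacity in bytes for each volume, keyed by ElementName."""
        capacities = {}
        for raw in self.fetch("CIM_StorageVolume"):
            inst = self.normalize(raw)
            size = int(inst["BlockSize"]) * int(inst["NumberOfBlocks"])
            capacities[str(inst["ElementName"])] = size
        return capacities


class AcmeCollector(Collector):
    """Hypothetical vendor whose agent leaves ElementName unset."""

    def normalize(self, inst: Instance) -> Instance:
        fixed = dict(inst)
        if "ElementName" not in fixed:
            fixed["ElementName"] = fixed["DeviceID"]
        return fixed


# Canned data standing in for two live agents (assumption, for the demo only).
def compliant_agent(class_name: str) -> List[Instance]:
    return [{"ElementName": "vol0", "DeviceID": "0",
             "BlockSize": 512, "NumberOfBlocks": 2048}]


def acme_agent(class_name: str) -> List[Instance]:
    return [{"DeviceID": "lun7", "BlockSize": 512, "NumberOfBlocks": 4096}]


if __name__ == "__main__":
    print(Collector(compliant_agent).volume_capacities())
    print(AcmeCollector(acme_agent).volume_capacities())
```

The point of the shape is that `volume_capacities` never changes per vendor; only the small `normalize` override does.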
Moving to SMI-S is not exactly easy. As with all standards, there is a certain level of over-design, and there are obvious points where things may have been negotiated too long for naught. Further, with SMI-S, we mine only a small percentage of the information from the model. Advocates of proprietary protocols often hold up examples of how easy it is to mine such a small percentage of data by using proprietary libraries and, in many ways, they are correct. Prototyping with proprietary libraries is often very focused, quick and easy. Tackling the full SMI-S standard is like entering a world with TOO MUCH information and an often unwieldy model. The learning curve to get a small amount of information with SMI-S is long, whereas with proprietary libraries the learning curve to get the same small amount of information is, well, short (SMI-S 1.1 is approximately 1,500 pages...typical proprietary API documents are around 100-200 pages).

But here's where things get interesting: we often attend SNIA Plugfests where we bring SMP 2.0 and plug it in (this is easy since we run on a Solaris x86 laptop now...being based on Java ES, it was a quick and easy move for us with only a couple of glitches). At the plugfests, information from systems we have not tested with starts flowing into the portal. This would be impossible if we were based on the proprietary APIs of customers. For example, we can acquire performance information about some EMC arrays without adding code.

Not only is device compatibility easier for information collection; once our team mastered their section of the model, it became easier and easier to collect more and more information. Different customers have different information needs, and to mine that information from 10 vendors' proprietary APIs you end up needing 10 experts manipulating the libraries. And on top of the libraries, the architect layers a domain model, something that SMI-S already provides a large percentage of.
With SMI-S client compliance has come a wealth of information and capabilities at our fingertips. This isn't really the turtle vs. hare race that prototypers want you to believe we are in. The turtle was persistent but slow, and the hare was arrogant and fast and could have actually won the race. There is NO WAY that proprietary APIs, protocols and models can win this race. Maintaining a proprietary API if you're a vendor is a burden to your adopters and to yourself, as you are reinventing the wheel. Adopting proprietary protocols if you are a client is asking for a heavily staffed agent team that largely duplicates each other's work but with different vendors. This is more like having a race across the USA: I get a 747 and you get a Subaru Outback. If you are taking bets after 10 miles, chances are the Subaru is ahead. Talk to me when we get to the Mississippi River. I'm in Broomfield, Colorado; the Subaru is unlikely to be out of Colorado and I will have a 2 state lead with my 747. I like my odds with SMI-S.

There are many downsides to SMI-S.

A lot of information is only available in vendor "extensions" of the model. To this I answer: at least I don't have to navigate a different protocol and add another lookup API and, further, the extension points to the model are standard even if the contents of the extension points are not.

There is not complete device support, especially where legacy devices are concerned. This issue will likely gnaw away at us. Many companies have broad SMI-S support, but we also realize there are many companies with older devices that are still a valuable part of their storage ecosystem. As SNIA says, customers MUST be part of the solution: you will get better and more standard management tools with SMI-S adoption. Pressure your suppliers and vendors to provide SMI-S support for older arrays; it is worth their effort to keep you as a customer. Standards will make your ecosystem more manageable.

There is no in-band management. Well, this is a choice of the implementers of the SMI-S agent. There is no requirement that SMI-S agents talk out of band to the devices manipulated by the agent; the only requirement is that the management station talks to the SMI-S agent out of band. This can be tricky with high security networks. BUT, the more involved clients are with adopting SMI-S solutions, the more pressure we will have to show device vendors that this versatility is necessary. Working together with clients that have high-security networks can help build acceptable deployment architectures that maintain the integrity of the customer networks while leveraging the standard in the management tools.

Footprint...SMI-S agents tend to be heavier than a proprietary agent. This is largely true, I have to admit. It doesn't HAVE to be true, but it must be fixed in agent implementations and across the industry before SMI-S can deliver a knock-out blow to proprietary vendor libraries.

Finally, my big issue with SMI-S is the blending of the lookup, protocol and model. SMI-S should be about the model only. Adoption of WS-Management will be a huge step to standardizing tools and capabilities with the rest of the industry on a Web Services base, and we will be able to retire the "WBEM is CIM-XML and not SOAP" debate...and it's not soon enough.

The lesson with SMI-S that we are learning on the StorageTek Management Portal 2.0 is one that should resonate in all areas where there is a standard way to do things and a proprietary way to do things. If you have a one-off, go ahead, use the proprietary way...but if you are in it for the long haul, learning the standard, working with the standard and participating in the standard will provide velocity through breadth and depth of information as well as unexpected vendor reach that simply cannot be achieved by pursuing each vendor's API separately. There will be turbulence along the way, I guarantee it.
But are you positioning yourself for the future, or are you positioning yourself for next week?

Wednesday Jun 28, 2006

WS-Management Specification

The WS-Management Specification is from the Distributed Management Task Force (DMTF).  It is a Web Services extension specification to aid in the field of systems management.  This is my take on WS-Management after an initial reading of the specification. [Read More]

Wednesday Jun 14, 2006

Web 2.0, it's not just AJAX, let the storage folks have some fun too

Storage systems are rarely mentioned in the same sentence (and most often not even in the same conversation) as Web 2.0.  There could be quite a few reasons but, in the end, it seems like systems folks take storage for granted and storage folks take systems for granted.  On the other hand, storage could be perceived as just plain boring.

Still, one (I) could easily make an argument that storage is the single most important aspect of Web 2.0.  Let's go back to the Wikipedia definition for a moment, "The term Web 2.0 refers to a second generation of services available on the World Wide Web that lets people collaborate and share information online."

Now, I have to tell you, all of that information flying around the web is getting stored somewhere.  Storage sales continue to fly, and if you pay attention, Oracle isn't doing too badly either.  Where does all of that information and content go when folks aren't accessing it?  It's at rest on a platter, in flash memory, on a tape...it's somewhere, just waiting to be accessed.

An obvious Web 2.0 need is efficient Information Lifecycle Management (ILM).  Think about something as simple as Instant Messaging.  Instant Messaging is about "now": I communicate...well...instantly.  Rarely do I go back to recall what I talked about, but on occasion, I go surfing through my old IMs to get URLs, recall important details, and more.  There is NO REASON that all of my IM conversations should be on high priced storage with immediate access yet, still, I would like timely access to my old IMs.  This is tiered storage, and Sun has a good story around this.  Further, with all of the government regulations in the industry about retaining information, the StorageTek acquisition looks very solid.  I used to work at Imation in Minnesota.  The execs were enamored with tape and diskettes; I was a young kid with software infrastructure on his mind and was largely dismissive of this tape thing.  I have to tell you, I look back at those execs with some amount of respect: tape is fundamentally important, especially in Web 2.0 if you couple it with a sound ILM strategy.  Look at the history of the Imation stock and where it goes after a major disaster.  Companies look at their disaster recovery plans and they basically say "I gotta get me some of that tape".
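The IM example boils down to an age-based placement rule, which might look something like the sketch below. The tier names and the 30-day and 365-day thresholds are invented for illustration; a real ILM policy would weigh far more than last-access time (regulation, cost, retrieval expectations and so on).

```python
from datetime import datetime, timedelta

# Hypothetical tiers, fastest and most expensive first.
# Anything older than the last threshold falls through to tape.
TIERS = [
    (timedelta(days=30), "disk"),
    (timedelta(days=365), "nearline"),
]


def choose_tier(last_access: datetime, now: datetime) -> str:
    """Pick a storage tier from how long ago the item was last touched."""
    age = now - last_access
    for max_age, tier in TIERS:
        if age <= max_age:
            return tier
    return "tape"


if __name__ == "__main__":
    now = datetime(2006, 6, 14)
    for days in (3, 90, 800):
        print(days, "days old ->", choose_tier(now - timedelta(days=days), now))
```

Yesterday's conversation stays on fast disk, last quarter's drifts to something cheaper, and the one from two years ago ends up on tape where it is still retrievable, just not instantly.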

Ok, ILM, good, major player in Web 2.0, Sun, we have it covered.  A bit boring, but boring is important.  Quicken is boring too, as in, accountant boring, but just try to pry Quicken from my fingers at tax time...

What about this grid thing that Sun is all over?  CPU grids are fun and exciting, but the relationship to storage is largely done with existing technologies (shared filesystems, SAN, and data movers).  Storage attachment to grid is often done by building one big fabric, maybe one big zone, and heavy use of LUN mapping/masking.  SMI-S can play a great role here.  Provisioning, changing access rights and carving capacity shouldn't be different for every array; there is no differentiation here.  Information access and retrieval in a Web 2.0 world is the lifeline of a company.  Differentiate on quality of service, reliability, performance and other attributes that matter in a Web 2.0 world, then use a standard API so your storage can be plugged into grid management and systems management tools so your application's storage can follow the application and the demands that the grid understands.

Hmm, again, Sun's all over this area.  Shared filesystems, ZFS, SMI-S, all good.  Still pretty boring though.

What's interesting about all of the above is that storage is slaved in its traditional way to systems.  Filesystem access, block storage coupling, LUNs, SCSI targets, NFS, CIFS, and so on.  Web 2.0, though, is about semantically rich content, pictures, conversations, combining satellite maps with GPS signals (hmmm, what about Google maps and sounds so I could "hear" the location I'm looking at...that's creepy), micro payments, extending the Internet from a browser and into the very fabric of our existence...listening to a podcast on the Internet with my phone while I wait for a call from my boss...having my address book with me at all times no matter what device I have with me or where I am, or whose tablet I'm teaming on at Starbucks.  Doesn't it seem like a filesystem and blocks don't quite do this information rich world justice?

Rather than talking about CPUs and assuming you have to fit your data into a filesystem or database, why not bring storage and applications closer together and make all of those CPUs and blocks go away...virtualization you know?  Instead of creating a file format that contains metadata about a picture and tracks collaborations, then points to another file with a jpeg and an index that associates the two, why not build ONE storage API that you can extend with as much metadata as you want and you can retrieve the picture directly based on a query of metadata without having to traverse fragile networks of files or database tables?  Standardize that API...instead of serving filesystems, serve objects for a change.
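As a thought experiment, the "ONE storage API" idea could be as small as the sketch below: store the bytes, hang whatever metadata you like on them, and get them back with a metadata query. Everything here is hypothetical (the `ObjectStore` name, its three methods, the in-memory dict) and it ignores everything that makes real storage hard, but it shows the shape of the interface.

```python
import uuid
from typing import Dict, List, Tuple


class ObjectStore:
    """Toy in-memory object store: data plus free-form metadata, no files."""

    def __init__(self) -> None:
        self._objects: Dict[str, Tuple[bytes, Dict[str, object]]] = {}

    def put(self, data: bytes, **metadata: object) -> str:
        """Store an object with any metadata and return its generated id."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def annotate(self, object_id: str, **metadata: object) -> None:
        """Extend an existing object's metadata without touching its data."""
        self._objects[object_id][1].update(metadata)

    def query(self, **criteria: object) -> List[bytes]:
        """Return the data of every object whose metadata matches all criteria."""
        return [
            data
            for data, meta in self._objects.values()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]


if __name__ == "__main__":
    store = ObjectStore()
    photo = store.put(b"...jpeg bytes...", kind="photo", owner="alice")
    store.annotate(photo, album="tahoe", shared_with="bob")
    print(len(store.query(kind="photo", album="tahoe")))
```

There is no sidecar metadata file and no index file to keep in sync; the picture and what we know about it travel together.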

Now that would make storage exciting.  Then, don't tie it to my server.  Make it a remote API so as my grid adds CPU and my Web 2.0-AJAX-Whizz-bang-Collaborative-Photo-and-Movie sharing site attracts eyes, I don't have to do any crazy driver deployments and maintenance...I hate drivers...maybe just use a powerful, Java-based API that deploys to any platform...yeah, that would be the ticket.

Web 2.0 and storage.  If you think about it, it might be more fun than the servers if you just concentrate on it.  And please...stop with the AJAX, you just remind me of how dirty my house is...there is more to Web 2.0 than the AJAX...though, admittedly, AJAX is cool stuff...

Saturday Sep 17, 2005

Sun StorEdge Management Portal

Storage management and storage environments are natural places to leverage enterprise application integration techniques. Storage networks are inherently complex in nature: there can be hundreds of storage elements with dozens of applications to help manage the overall environment. One approach to managing the environment is to build a single application that normalizes the operations across the heterogeneous environment. The result is an application that aids in day to day management and can often orchestrate large sets of elements. Sun's StorEdge Operations Manager takes this approach, and it's a good approach.

The Sun StorEdge Management Portal takes an alternative approach to bringing a storage environment together. Rather than create a new environment that manages all storage elements from a single pane of glass, the Sun StorEdge Management Portal acknowledges the multiple applications that reside within the storage network (including StorEdge Operations Manager). The portal environment ties telemetry from devices together with live information from the Internet in an "at a glance" format. When a storage administrator sees something of interest, they launch the application that services the storage element to take action.

For example, one of the management portal's portlets displays the available capacities of all storage pools from all StorEdge 6920 systems in the storage network. At the portal tier, the administrator can locate a storage pool on one of their 6920s with the appropriate capacity and with the appropriate attributes (RAID, spindles, etc...). With a simple click, the element manager for the StorEdge 6920 that holds that storage pool is launched to the proper location to carve the LUN. Cool.

The idea is simple. Leverage the investment in the existing software. Integrate the information from the storage network. Let administrators personalize their environment to make it more effective for them.
Sun's software makes it all possible. We use Sun's Java System Portal Server as a user interface tier, and other software from the Java Enterprise System to build out the integration tiers. The content and presentation within the portal leverages standards (JSR-168, SMI-S, HTTP, RSS, and more). Take a look if you want, here's a link to the Getting Started Guide chapter for ESM Base Applications 4.0 that describes the portal. I'm excited, obviously...some days I love working for a company that has technology that encompasses every second of the lifecycle of data (from data origination on a phone to sticking the data into a salt mine).
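For the curious, the "find a pool at the portal tier, then launch the element manager" flow from the 6920 example is, at heart, a filter over telemetry plus a link. Here is a rough Python sketch of that idea. The `StoragePool` fields, the sample systems and the `manager_url` values are all invented; they are not the portal's actual data model or URLs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoragePool:
    system: str       # which array the pool lives on
    name: str
    raid_level: str
    free_gb: int
    manager_url: str  # hypothetical deep link into that array's element manager


def find_pools(pools: List[StoragePool], min_free_gb: int,
               raid_level: str) -> List[StoragePool]:
    """Return matching pools, tightest fit first, so big pools stay free."""
    matches = [p for p in pools
               if p.free_gb >= min_free_gb and p.raid_level == raid_level]
    return sorted(matches, key=lambda p: p.free_gb)


if __name__ == "__main__":
    pools = [
        StoragePool("array-a", "pool1", "RAID5", 900, "https://array-a.example.invalid/pools/pool1"),
        StoragePool("array-b", "pool4", "RAID5", 250, "https://array-b.example.invalid/pools/pool4"),
        StoragePool("array-b", "pool2", "RAID1", 500, "https://array-b.example.invalid/pools/pool2"),
    ]
    best = find_pools(pools, min_free_gb=200, raid_level="RAID5")[0]
    print("Launch element manager at", best.manager_url)
```

The portal never carves the LUN itself; it only narrows the choice and hands the administrator to the tool that already knows how.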


