Friday Sep 04, 2009

HPC Virtual Conference: No Travel Budget? No Problem!

Sun is holding a virtual HPC conference on September 17th featuring Andy Bechtolsheim as keynote speaker. Andy will be talking about the challenges around creating Exaflop systems by 2020, after which he will participate in a chat session with attendees. In fact, each of the conference speakers (see agenda) will chat with attendees after their presentations.

There will also be two sets of exhibits to "visit" to find information on HPC solutions for specific industries or to get information on specific HPC technologies. Industries covered include MCAE, EDA, Government/Education/Research, Life Sciences, and Digital Media. There will be technology exhibits on storage software and hardware, integrated software stack for HPC, compute and networking hardware, and HPC services.

This is a free event. Register here.

Tuesday Nov 11, 2008

Unified Storage Simulator: Too Fun to be Legal

As I mentioned, we released a simulator as part of the Sun Storage 7000 Unified Storage System (*) launch event yesterday. I geeked out and took it for a quick spin today to get a first-hand view of its capabilities. Short summary: this is a really cool way to play with one of these new storage appliances from the comfort of your existing desktop or laptop with no extra hardware required. It's all virtual, baby. Check this out...

The simulator (download it from here) is basically a VMware virtual machine that has been pre-loaded with the Unified Storage System software stack and configured with 15 virtual 2 GB disks. Here is what I did to try it out.

First I downloaded the simulator and booted the virtual machine using VMware Fusion on my MacBook Pro. The boot is straightforward with one tiny exception. At one point I was asked for a password, which stumped me since there is no mention in the instructions on the download page about supplying a password. As it turns out, this is where you specify the root password for the appliance. Pick something you will remember since you will need it to access the appliance's administrative interface later.

Once the appliance has booted, it displays some helpful information about next steps, the most important of which is to access the appliance via its web interface to configure it for use. With the appliance running in a virtual machine on my Mac, I pointed Safari under Mac OS X at the hostname (or IP address) the appliance was assigned by the DHCP server when it booted, using the port number shown in the documentation (port 215).

I then logged into the appliance using the root password I had specified earlier. The BUI then walked me through a set of configuration steps that included networking, DNS, NTP, and name services. The process was simple and quick since the defaults were correct for most questions. Once I finished the basic configuration, I reached the following screen:

This is where it starts to get fun. This interface helps you choose which replication profile makes most sense for the storage that will be managed by your appliance. Each option is ranked by availability, performance, and capacity. The pie chart on the left illustrates how storage will be allocated under each scheme. In the case of Double Parity RAID, you can see that data and parity are placed on 14 disks and the last disk is held as a spare. In contrast, when I selected the "Striped" option, I saw this:

You can see that this strategy delivers maximum capacity and also great performance, since all those spindles can work at once on my IO requests, but at the expense of availability, which might be perfectly fine for a scratch file system. I opted for the Double Parity RAID scheme for my filesystem.

Once I configured the storage I visited the Shares tab and created an NFS filesystem called "ambertest." Again, this was straightforward. Straightforward enough that I forgot to take a screenshot of that step. Sorry about that.

I then mounted my new NFS filesystem under Mac OS X:

% sudo mkdir -p /Volumes/amber
% sudo mount -t nfs ip-address-or-hostname-of-the-virtual-storage-appliance:/export/ambertest /Volumes/amber

As a test, I copied several directory trees from my local Mac file system into the NFS filesystem exported by the virtual appliance and also ran several small test scripts to manipulate NFS files in various ways to generate load so I could play with the Analytics component of the appliance.
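My quick-and-dirty load scripts aren't worth sharing, but a sketch of that kind of NFS churn might look like the following. Everything here is hypothetical rather than what I actually ran: the mount-point argument, file counts, and sizes are arbitrary.

```shell
#!/bin/sh
# Hypothetical NFS load generator: write and read back small files under a
# target directory so the appliance's Analytics has something to display.
# Pass the NFS mount point as $1 (e.g. /Volumes/amber); defaults to a
# local scratch directory so it is safe to run anywhere.
TARGET="${1:-$(mktemp -d)}"
mkdir -p "$TARGET/loadtest"

for i in 1 2 3 4 5; do
    # write a 64 KB file, then read it back, generating both write and read ops
    dd if=/dev/zero of="$TARGET/loadtest/file$i" bs=1024 count=64 2>/dev/null
    cat "$TARGET/loadtest/file$i" > /dev/null
done

ls "$TARGET/loadtest"
```

Run a few copies of something like this in parallel and the Analytics timelines light up nicely.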

This is the part that should be illegal. Because the appliance stack is built on top of Solaris, DTrace is available for doing deep dives on all sorts of usage and performance information that would be interesting to an administrator of such a storage appliance. Here is one silly example that will give you a little flavor of what you can do with this capability. It is a much wider and deeper facility than this simple example shows.

Consider the following page:

I used the Analytics interface to graphically select a metric of interest--in this case, the number of NFS v3 operations per second broken out by filename. The main graphical display shows how that metric varies over time and allows me to move backward or forward in time, look at historical data, zoom in and out, pause data collection, etc. The lower pane in this case shows me the directories currently being touched within the NFS filesystem by ongoing operations. Individual files are listed in the small pane to the left of the timeline.

When I selected the pltestsuite line in the bottom pane, the timeline updated to show me exactly when the operations related to the files in that directory actually occurred. Since my test was a simple 'cp -r' of a directory tree into the NFS-mounted directory on the Mac, the display shows me when the files within the pltestsuite directory were copied and the NFS load generated by that part of the overall copy operation. I can easily see which file activity is contributing to load on the appliance--very useful for an administrator, for example.

In addition to examining NFS operations by filename, I can break down NFS statistics by type of operation, by client, by share, by project, by latency, by size, or by offset. I can do the same for CIFS or HTTP/WebDAV requests. NFS v4? No problem. Network traffic, disk operations, cache accesses, CPU utilization? It's all there in one easy-to-use, integrated web-based interface. To me, the Analytics are one of the coolest parts of the product since observability is often the first step to good performance and effective capacity planning.

If you are curious about the capabilities of the Sun Storage 7000 Unified Storage System line, I do recommend trying the simulator. In addition to offering an effective way to explore the product without buying one (we expect you'll want to buy one after you finish :-) ), it is interesting to see how desktop virtualization neatly enabled us to create this simulator experience.

(*) Named, no doubt, by Sun's department of redundancy department. I liked the code name better.

Monday Nov 10, 2008

Kicking the Crap Out of Storage Economics

While I'm mostly a "compute" guy, I know very well that at the end of the day servers are really just data manipulators and that for our customers storage plays an absolutely central role in their businesses. That is why I've watched with fascination as some of Sun's best engineers banded together with the deliberate intention of rethinking how storage products are built, how they are priced, and what they can do. Today, Sun announced the fruits of their labors--the Sun Storage 7000 Unified Storage Systems. Better yet, check out Mike Shapiro's blog for his thoughts on what he and the rest of the team have accomplished. And be sure to check out the Sun Unified Storage Simulator Mike mentions if you want an easy way to see what all the hoopla is about.

View these products as an example of what Sun can do as a systems company when we bring our wide array of expertise and intellectual property together to create unique solutions that we believe customers will value and our competitors will loathe. That goes especially for competitors in the business of charging premium prices for capabilities we can now deliver at an entirely new price point.

In this case, we've integrated Sun server technology, flash memory and storage together under the control of a software stack based on OpenSolaris software innovations like ZFS and DTrace to create a set of NAS appliances that offer both extreme performance as well as built-in functionality other storage vendors either don't have or treat as expensive add-ons. We've talked a lot about Open Storage recently and this new product line is a poster child for what we've been talking about.

The capabilities of these new systems are amply documented elsewhere, so I will not delve into details here, except to mention that an enormous amount of engineering work and talent has been applied to this product line. The result is a system that offers a simple appliance-like installation and management experience; unmatched observability of overall system status as well as much more detailed drill-downs based on DTrace technology; high availability; and hybrid storage pools--a transparent combination of DRAM, flash, and HDD that delivers significant read/write acceleration and some amazing performance numbers.

I could blather on about how great these products are, but thankfully you don't need to take my word for it. What matters is how these systems perform in particular deployment environments. These storage systems are available through Sun's Try and Buy program, so you can judge for yourself. We pay shipping, you put the system through its paces and make your own decision. You don't like it, we pay return shipping. You buy it, we give you a big discount. Seems like a deal to me.

Monday Jun 16, 2008

HPC Consortium: University of Ulm's Solaris Geek

Yesterday afternoon, Thomas Nau who is Head of the Infrastructure Department at the University of Ulm and a self-described Solaris geek, gave a talk titled "Storage the Solaris Way" at the Sun HPC Consortium meeting here in Dresden. The main points of his talk were an overview of the ZFS value proposition and a quick tour through cool things one can do with Solaris out of the box, for example using iSCSI and using the various network attached storage solutions available as part of Solaris.

Thomas first reminded the audience of what he and most other people in the HPC community want in a storage solution: safety and reliability, fast error detection and correction, performance, expandability, and interoperability via open standards. All of which are offered by ZFS.

With respect to safety and reliability, Thomas mentioned the following ZFS attributes:

  • 256-bit checksums for everything, not just metadata
  • ditto blocks to create copies of mission-critical metadata
  • transactional i/o semantics using COW (copy-on-write)
  • instant snapshots and clones
  • exploitation of on-disk and on-array caches for performance
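To give a flavor of the snapshot and clone points, here is roughly what they look like from the ZFS command line (the pool and filesystem names are invented for illustration; this assumes a Solaris system with an existing pool named tank):

```
# zfs snapshot tank/home@monday              # instant, point-in-time snapshot
# zfs clone tank/home@monday tank/home-test  # writable clone backed by the snapshot
# zfs list -t snapshot                       # list existing snapshots
```

Because of ZFS's copy-on-write design, both operations complete almost instantly regardless of how much data the filesystem holds.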

With respect to built-in Solaris storage options, Thomas took the audience through a whirlwind tour of Solaris network attached storage (NAS) capabilities as well as block-level access using iSCSI. He also managed to demo all of this using his laptop, which was running two virtual machines called Angelina and Brad.

As shipped, Solaris has built-in support for NFSv4, Samba, and (in OpenSolaris) CIFS. As Thomas pointed out, the Samba implementation has been modified to support ZFS as a virtual file system back end and the CIFS server has been implemented in the kernel for maximum performance advantage.

To demonstrate iSCSI, Thomas set up several storage pools and then exported them via iSCSI as normal disks from Angelina to Brad where he mounted the disks in a mirrored configuration, which was all quite easy to do. Your correspondent, however, was not fast enough to capture the details of the demo. I expect that slides will be made available on the HPC Consortium website at some point.
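While I can't reproduce Thomas's exact commands, on the Solaris releases of that era the target side of such a demo could be sketched roughly like this (the pool, volume, and address names are invented; the shareiscsi property was how ZFS volumes were exported as iSCSI targets at the time):

```
angelina# zfs create -V 2g tank/brad-vol       # create a 2 GB ZFS volume
angelina# zfs set shareiscsi=on tank/brad-vol  # export it as an iSCSI target

brad# iscsiadm add discovery-address 192.168.1.10:3260
brad# iscsiadm modify discovery --sendtargets enable
brad# devfsadm -i iscsi                        # create device nodes for the new disks
```

From there, the imported disks can be placed in a mirrored zpool on the initiator just as if they were local drives.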

In terms of performance, a test done at Ulm showed that the iSCSI approach, built on 2x2 redundant x4500 servers, delivered performance comparable to the previous FC-AL solution, which had used several small storage arrays.

Sunday Jun 15, 2008

HPC Consortium: Storage

Storage was the theme of the first set of talks at the Sun HPC Consortium meeting here in Dresden.

Peter Braam, Vice President for Lustre, spoke first about Open Storage, which we at Sun believe marks an important shift within our industry comparable to the shift we've seen towards Open Servers and that we expect to see in the future with networking. Open Storage in a nutshell: an approach that leverages open source software, open architectures, common components, and the interoperability of open standards to create innovative storage products with breakthrough economics. For example, we do not believe expensive, closed, hardware RAID controllers are a part of the open storage future. Instead, data integrity will be delivered with software like Sun's ZFS filesystem, with its end-to-end data integrity model, using inexpensive disks and the considerable capabilities of increasingly powerful standard compute servers.

Peter Bojanic, Director of the Lustre Group, spoke next. He started with some fun facts about Lustre, Sun's parallel cluster file system, which joined Sun's portfolio with the acquisition of Cluster File Systems, Inc. Some of the superlatives he mentioned: 25,000 clients accessing a single Lustre file system on Red Storm at Sandia National Laboratories, and CEA achieving an aggregate 100 GB/sec transfer rate with their Lustre configuration. He also pointed out that Lustre is used on 7 of the 10 largest supercomputers in the world (per the Nov 2007 TOP500 list).

Peter spent the bulk of his time discussing the Lustre roadmap, which my fingers were not nimble enough to capture in any detail. I expect the slides to be posted at some point on the Consortium website, so watch for them there.

Harriet Coverston, Sun Distinguished Engineer, spoke about Shared QFS and the Storage Archive Manager (SAM), two storage products that are well-known in the HPC community. Perhaps less well-known is that QFS has an increasing footprint with non-HPC enterprise customers who need its scalability, performance, and reliability. Home Box Office (HBO) is a great example of this. Read about their use of QFS here.

After giving an overview of Shared QFS and SAM, Harriet spoke about her group's plans to move to what she called intelligent storage. The intent is to move from a traditional SAN approach to one that embraces T10-based object storage mechanisms, which will greatly increase QFS scalability from its current limit of 256 clients to the range of thousands of clients. Read more about object-based storage here.

Our cluster of storage-related talks ended with a presentation by Chris Wood, CTO for Sun's Storage and Data Management Practice. Chris focused on how Sun plans to deliver complete, modular, and scalable storage solutions rather than merely excellent individual products and technologies. The fundamental problem is how best to satisfy a wide range of potentially conflicting user requirements, the most common of which are high performance, low cost, high capacity, and a single architectural approach for all workloads. Sun's approach uses a modular architecture that can grow and shrink based on customer requirements and which leverages Sun's hardware and software: for example, SAM-QFS for high performance and data archiving, Lustre for additional scalability and performance, the x4500 storage server, high performance network interface cards (for example, Neptune), and current and future x86 and T2-based servers.

Saturday Mar 22, 2008

SAM-QFS Open Sourced: Delivering on the Promise of Open Storage

[opensolaris logo]

We call them Sun StorageTek QFS and Sun StorageTek SAM these days, but long-time HPC people will remember them simply as QFS and SAM (or collectively as SAM-QFS), the very well-respected file system and archive management software created by LSC, Inc and acquired by Sun back in 2001. These products are very well-known and widely-used in certain segments of the HPC market, particularly those with high-performance SAN requirements and with a need for kick-butt streaming IO performance. More recently, these products have had some huge successes in non-HPC areas, demonstrating that they not only scale and offer high performance, but that they have enterprise-class stability as well.

The big news is that we've open sourced SAM-QFS. The latest development bits are now available for download, modulo a few pieces owned by 3rd parties which we either need to rewrite or will release if permission is obtained. Ted Pogue has done a nice writeup about all of this, including pointers to the discussion forums for the SAM/QFS OpenSolaris project.

Take a look at Sun's 2001 press release about the LSC acquisition (but do NOT click on the link--it has been reaped by spammers.) We were talking about Open Storage, even in 2001. As Bob Porras points out, we have now created an entire open source storage stack with this release. How cool is that?


Josh Simons

