Tuesday Apr 21, 2009

Video: Top 5 Cool Features of the Sun Storage 7000 Unified Storage Systems

A couple of weeks ago, Marc (our producer from the HELDENFunk Podcast) and I sat down and put together a video about the top 5 reasons why the new Sun Storage 7000 systems are so cool. We even "invited" Brendan Gregg to show us his latest trick:

For the next video, I'll try to learn more phrases by heart and look less at the prompter screen for a more natural feel. I apologize for my German accent (some people say it adds credibility :) ). Still, people seem to like the video: it has been viewed about 200 times already.

There's a lot of discussion around the Sun Storage 7000, and most of it is very positive. In Germany, we like to complain a lot, so of course we also hear a lot of constructive criticism. Most of the comments I hear fall into one of the following two categories:

  1. The Storage 7000 systems are cool, but I know ZFS/OpenSolaris can do "X" and I really want this to be in the Storage 7000 GUI as well!
    Yes, we know that there are still many features we'd like to see in the Storage 7000 and we're working on making them available. Make sure your Sun contact knows about your wishlist, so she can forward it to our engineers. Please remember that the Storage 7000 systems are meant to be easy-to-use appliances: Taking your "X" feature from ZFS/OpenSolaris and building a GUI around it is a hard thing to do, especially if you want it to work reliably and if you want it to be self-explanatory and self-serviceable. Please be patient, we're most probably working on your favourite features already.

  2. The Storage 7000 systems are cool, but I want more control. I want to change the hardware/hack them/take them apart/add more functionality/get them to do exactly what I want, etc.
    Sure, that feature is called "OpenSolaris". Please go to OpenSolaris.org, download the CD, install it on your favourite hardware and off you go!
    But, can I have the GUI, too, maybe as an SDK of some sort?
    No. The Storage 7000 systems are not "just a GUI". They are full-blown appliances which means that they're more than just the hardware and a GUI. A big part of the ease-of-use, stability, performance and predictability of these products is in the way configuration options are selected, tested and yes, limited, as well as a careful consideration of which features to implement at what time and which not. Only then comes the GUI on top, which is tailored to the overall product as a whole. In other words: You wouldn't go to BMW and ask them to give you their dashboard, radio and the lights so you can bolt them onto a Volkswagen, would you?

You see, either you build your own storage machine out of the building blocks you have, and get all the functionality and flexibility you want at the expense of some configuration effort, or you buy the car as a whole, nice, round, sweet package, so you don't worry about configuration, implementation details, complexity, etc. Asking for anything in between will get you into trouble: Either you'll spend more effort than you want, or you won't get the kind of control you want.

If you understand German, there's some discussion of this topic as well as a great overview of the MySQL future plus a primer on SSDs in the latest episode of the HELDENFunk podcast.

And if you like the Sun Storage 7000 Unified Storage Systems as much as I do, here are the slides in StarOffice format, as well as in PDF format, so you can tell your colleagues and friends as well.

Thursday Mar 26, 2009

Cloud Computing in 6 Minutes

Yesterday I visited Sun's European Education & Research Conference in Berlin where my colleague Manuel and I ran a session on Web 2.0 and Cloud Computing. Web 2.0 companies have really pioneered the use of cloud computing for their businesses, taking advantage of the low entry cost and high elasticity that clouds provide. These are really good things if you have only a few hundred or so users one day and then, all of a sudden, face hundreds of thousands of them, just because somebody featured your company on Techcrunch or some famous VC twittered about your service. The two subjects go really well together, so our session room was quite packed and we had some good discussions with attendees afterwards.

Sun Campus Ambassadors Alper Celik and Gökhan Dogan from KTH University in Sweden were busy interviewing a lot of people during the conference with their digital camera, and both Manuel and I got our few minutes of YouTube fame with them. Here's Manuel talking about Web 2.0:

And here's yours truly, trying to explain Cloud Computing in about 6 minutes:

Curious about Cloud Computing? Check out the Sun Cloud or start developing Web Services inside the Cloud from the comfort of your web browser the easy way using Zembly.

Alper and Gökhan were really busy: they published a bunch of other interviews on YouTube the very same day. Just search YouTube for "European Education and Research Conference" and you'll find more than a dozen of their interviews.

Gökhan also participated in his university's WaterWell project that used Sun SPOT technology to create a wireless sensor network that monitors water quality. Here's Gökhan explaining his project:

With a generation of students that show this kind of motivation, I'm not really worried about how to come out of this recession :).

Monday Mar 02, 2009

The Inner Life of ZFS: Cool ZFS On-Disk Block Structure Movies

Pascal Gienger of Konstanz University wrote a nifty DTrace script that captures ZFS' on-disk block activity and published it on his Southbrain blog.

The cool thing: He animated the data. That's right. Using a Perl script, he draws greener or redder dots depending on whether a particular range of blocks on disk sees more reads or writes. By aggregating data over many hours while doing interesting tasks such as backup, he created a series of very cool animations.
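To make the idea concrete, here is a minimal sketch in Python of the aggregation and coloring step. It is not Pascal's script (his is written in Perl and fed by DTrace); the event format, the bucket size and the "green means reads, red means writes" mapping are assumptions made purely for illustration.

```python
from collections import Counter

def bucket_io(events, bucket_size):
    """Count reads and writes per block range.

    events: iterable of (block_number, op) pairs, op being "R" or "W".
    Returns {bucket_index: Counter({"R": reads, "W": writes})}.
    """
    buckets = {}
    for block, op in events:
        buckets.setdefault(block // bucket_size, Counter())[op] += 1
    return buckets

def dot_color(reads, writes):
    """Map a read/write mix to (r, g, b): more reads = greener, more writes = redder."""
    total = reads + writes
    if total == 0:
        return (0, 0, 0)  # no activity in this range: black dot
    return (round(255 * writes / total), round(255 * reads / total), 0)

# Made-up sample events, not real trace data.
events = [(10, "R"), (12, "R"), (15, "W"), (1030, "W"), (1040, "W")]
for index, counts in sorted(bucket_io(events, 1024).items()):
    print(index, dot_color(counts["R"], counts["W"]))
```

Drawing one such dot per block range for every time slice, then stringing the frames together, gives the kind of animation described here.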

In his first post, he shows us the inner life of a Postfix mail queue as an animated GIF:

ZFS on-disk block animation

Then, he compared the write patterns of UFS vs. ZFS using a MySQL workload to produce a cool MPEG-4 movie.

In his latest ZFS animation work, he shows us 18 hours of a mirrored file server including some backup, night rest and user action (Download MPEG-4 Movie here).

Congratulations, Pascal, this is way cool stuff. You really should upload these to YouTube so people can embed them in their blogs :).

Update: Meanwhile, Pascal told me that he has already uploaded his videos to YouTube. He has a whole playlist of them. Enjoy!

Wednesday Oct 15, 2008

Chip Multi-Threading, Cooking and the Anatomy of a Viral Video

Here's a fun video about chip multi-threading, explained through cooking:

The Story

For those of you who don't speak German: Ingo, the hero of this movie, wants to cook German roulades. He uses his hands as registers, while his table serves as a level 1 cache. The instruction cache is his brain, where the recipe resides. Soon, he reaches the point at which it says: "Pour red wine into the pan". There's no red wine in the registers, no wine in the L1 cache, so he needs to ask his memory subsystem: "Hooooney, would you mind bringing me a bottle of Merlot from the basement, pleaaaase?"

While "honey", the memory subsystem, is busy bringing wine, Ingo explains that at this point, there's no difference whether he stirs the dish at 1.4 GHz, or at 4.5 GHz (this is the piece where his stirring gets frantic). Actually, he'd rather use his precious time to do other useful things with what he has in L1 cache already, for example cook dumplings, or prepare dessert. That would indeed help a lot in getting dinner ready sooner, even while waiting for "honey" to bring some wine.

And that is the whole point of chip multi-threading.

Now, imagine 8 Ingos, each with two hands (think pipelines) and doing 4 dishes per hand (read: threads). What a feast!
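For those who prefer numbers to roulades, here is a small back-of-the-envelope sketch in Python. It uses a deliberately idealized model (every thread alternates between a fixed stretch of computing and a fixed wait for memory, and switching threads costs nothing), and the cycle counts are hypothetical, not measurements of any real chip.

```python
def pipeline_utilization(threads, compute_cycles, stall_cycles):
    """Fraction of cycles a pipeline spends on useful work.

    Idealized model: each thread computes for compute_cycles, then waits
    stall_cycles for memory; other threads fill the waiting time.
    """
    busy = threads * compute_cycles / (compute_cycles + stall_cycles)
    return min(1.0, busy)

# Hypothetical workload: 25 cycles of cooking, then 75 cycles waiting for "honey".
for threads in (1, 2, 4):
    print(threads, "thread(s):", pipeline_utilization(threads, 25, 75))

# 8 Ingos x 2 hands x 4 dishes per hand
hardware_threads = 8 * 2 * 4
print("hardware threads:", hardware_threads)
```

In this model a single thread keeps the pipeline busy only a quarter of the time, no matter how fast it stirs, while four threads per pipeline hide the wait completely.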

CMT Cooking Going Viral

I first saw Ingo giving this presentation in February, during Sun Germany's Partner University event. It was hilarious, the whole room was laughing and we knew he needed to do it again. So, with the help of a few people, Ingo and Ulrike created this fun video.

They posted it on YouTube in July and we featured it on one episode of the HELDENFunk podcast for German system admins. Soon, Ingo reached a few hundred downloads and we thought: "Cool, we have a new fun video to share!"

Then, Alex Wunschel, aka the "Podpimp", one of the more well-known podcasters in Germany and a listener of the HELDENFunk podcast, twittered about Ingo's memory subsystem called "Schatz!" (the German equivalent of "honey"). That was even cooler.

Then, Thomas Knüwer saw Alex' Tweet, and blogged about it. On the "Handelsblatt" blog. Think something like "Fortune" Magazine in German. And he got 14 comments. Gulp. 

The result: Ingo's views skyrocketed, soon he was in the thousands, and last time I checked, he had more than 13,500 views, for a 3.5 minute video about chip multi-threading and a memory subsystem called "honey". Nice!

Today, Alec and I chatted about Ingo's video and apparently, he liked it very much. Well, I guess Ingo can start counting again. This time, with English-speaking viewers, too. Have fun!

Would you like Ingo to dub his video in English? Or do you prefer the German version? Just drop a comment below! 

Wednesday Feb 13, 2008

Great Web 2.0 Videos to Show to Customers, Partners, Colleagues, Friends & Family

The past few weeks were very busy ones for me. I was preparing a lot of stuff for the Sun Germany Partner University 2008 in Fulda, which took place this Monday and Tuesday. The bad news is that I hardly had any time to blog. The good news is that I now have many things to blog about over the next couple of entries.

Web 2.0 was one of the main themes that permeated the agenda. There were presentations about tools for web 2.0 developers (Check out NetBeans and its wonderful JMaki plugin for instance), discussions on web scalability using CMT servers and I also had the honor of presenting a Web 2.0 overview talk.

During the general session, as an introduction to Sun's vision, we found this video to be quite breathtaking:

This video called "Did You Know 2.0" was developed by teachers in the USA who are concerned with the education of today's kids and how to prepare them for an exponentially changing, globalized and networked future. It's great to see so many concepts in this video that are at the heart of what Sun is doing, combined with a forward-looking, heads-up attitude, designed to shake us up and tell us "Wait a minute: There's significant change going on right now. Prepare for it". A lot of people asked me where to get this video after the general session (I was in charge of A/V support during general sessions), so now you know: Visit the Shift Happens website for high quality versions of the video as well as some background.

Many thanks to Danilo for pointing me to this video (and unconsciously influencing this year's partner university agenda)!

Here's another Web 2.0 related video that I like to use during presentations: "Web 2.0 ... The Machine is Us/ing Us" by Michael Wesch from Kansas State University:

A great summary of the history of the web: From HTML to XML to RSS syndication, blogging, video sharing, user-generated content to today's way of networking communities. Never has Web 2.0 been explained in an easier to understand way. The best thing about this video is that it has been created by non-techies: Michael Wesch and his team are actually anthropologists.

This is what I always repeat to customers: Web 2.0 is not about technology. It's about humanity.

Thursday Dec 06, 2007

X4500 + Solaris ZFS + iSCSI = Perfect Video Editing Storage

Digital video editing is one of those applications that tend to be very data hungry. At SD PAL resolution, we're talking about 720 pixels x 576 lines x 3 bytes of color x 25 full frames per second = about 30 MB/s of data. That's about 224 GB for a 2 hour feature film. Not counting audio (that would only be around 3-4 GB). And we (in Germany) haven't looked at HD or Digital Cinema a lot yet...
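The arithmetic above is easy to check with a few lines of Python. This sketch only restates the numbers from the paragraph and assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes).

```python
# Uncompressed SD PAL video, as described above.
WIDTH, LINES, BYTES_PER_PIXEL, FRAMES_PER_SECOND = 720, 576, 3, 25

bytes_per_second = WIDTH * LINES * BYTES_PER_PIXEL * FRAMES_PER_SECOND
film_bytes = bytes_per_second * 2 * 60 * 60  # a 2 hour feature film

print(f"{bytes_per_second / 1e6:.1f} MB/s")  # 31.1 MB/s
print(f"{film_bytes / 1e9:.0f} GB")          # 224 GB
```

Counted in binary units (MiB and GiB), the same data rate comes out a little lower, which is where the "about 30 MB/s" rounding comes from.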

During the last couple of weeks I worked with a customer who bought a Sun Fire X4500 server (you know, Thumper). The plan is to run Solaris ZFS on it, then provide big iSCSI volumes to the video editing systems, which tend to be specialized Windows or Mac OS X machines. Wonderful idea: Just use zpool create to combine a number of disks with some RAID level into a pool, then zfs create -V to create a ZVOL. Thanks to zfs set shareiscsi=on, sharing the volume over iSCSI is dead easy.

But it didn't work.

First, Windows wouldn't mount the iSCSI volume. After some trying, we discovered that there must be an upper limit of 2 TB to the size of iSCSI volumes that Windows can mount (we initially tried something like 5 or 10 TB). So be it: zfs create -V 2047G videopool/videovolume.
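Put together, the setup boils down to three commands. The following Python sketch only assembles and prints them as a dry run; it does not execute anything. The pool and volume names and the 2047G size come from the text, while the disk names and the raidz layout are hypothetical placeholders, and the shareiscsi property (set with zfs set) is the one offered by the Solaris releases of that time.

```python
import shlex

def zvol_iscsi_commands(pool, volume, size, vdev_type, disks):
    """Return the command lines that create a pool, a ZVOL and an iSCSI share."""
    dataset = f"{pool}/{volume}"
    return [
        ["zpool", "create", pool, vdev_type, *disks],
        ["zfs", "create", "-V", size, dataset],
        ["zfs", "set", "shareiscsi=on", dataset],
    ]

# Hypothetical disk names and RAID level, for illustration only.
disks = ["c0t0d0", "c0t1d0", "c0t2d0", "c0t3d0"]
commands = zvol_iscsi_commands("videopool", "videovolume", "2047G", "raidz", disks)
for command in commands:
    print(shlex.join(command))
```

Printing the commands first and pasting them into a root shell afterwards is a cheap way to double-check device names before a pool gets created on the wrong disks.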

Now it mounted ok, we formatted the disk with NTFS (yuck!) and started the editing system's speed test. Then came the real issue: The test reported a write performance of 8-10 MB/s, but the editing system needs something like 30 MB/s sustained to be able to record reliably!

After some trying, we started the systematic approach:

  • A simple dd from one disk to another yielded >39 MB/s.
  • dd'ing from one small ZFS pool to another exceeded 120 MB/s (I later learned that cp is a better benchmark because it works asynchronously with large chunks of data vs. dd's synchronous block approach), so that was again more than we needed.
  • We tried re-attaching our ZVOL through iscsiadm to test the iSCSI stack's performance and ran into a TCP fusion issue. Ok, I've always wanted to play with mdb, so we followed the workaround instructions and we were able to attach our own ZVOL over the loopback interface. Slightly less performance (due to up the stack, down the stack effects, I presume) but still way more than we needed. So, it was neither the X4500's nor ZFS' fault.

Finally, Danilo pointed me in the right direction: Nagle's algorithm. What usually helps maximize network bandwidth turns out to be a killer for iSCSI performance. For Solaris iSCSI clients, we know this already, but how do we turn off Nagle on Windows?

The answer is deeply buried inside Microsoft's iSCSI Initiator user guide: The "Addressing Slow Performance with iSCSI Clusters" chapter mentions a similar issue (although they talk about read not write performance) and they do mention RFC 1122's delayed ACK feature, which is related to Nagle's algorithm. The Microsoft document suggests a workaround which involves setting a variable in the registry, so it was worth a try (and my vengeance for having to use mdb before).

And lo and behold, the speed test now yielded 90-100 MB/s (close to a GbE link's raw performance)! Yippee, that was it! One little registry entry on the client side gave us a 10x improvement in iSCSI performance!

Now, can someone explain to me, why on Windows 2000 you need to set "TcpAckDelTicks=0" while on Windows 2003 the same thing is accomplished by saying "TcpAckFrequency=1" (which is the same thing, only seen from the other side of the division sign)?
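As a side note for people who write their own network code: an application that owns its sockets can switch off Nagle's algorithm per connection with the standard TCP_NODELAY socket option, instead of changing system-wide settings. The Python sketch below shows just that. It is not what the registry workaround does (that one changes how the Windows TCP stack delays ACKs, and the iSCSI initiator's sockets are not ours to configure), so treat it as a related illustration only.

```python
import socket

# Create a TCP socket and disable Nagle's algorithm for it.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Read the option back to confirm that it took effect.
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print("Nagle disabled:", bool(nodelay))

sock.close()
```

With TCP_NODELAY set, small writes go out immediately instead of waiting for outstanding data to be acknowledged, which is the waiting game that hurts when the other side delays its ACKs.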

So, to all you storage hungry video editors out there: The Sun Fire X4500 with Solaris ZFS and iSCSI is a great solution for reliable, fast, easy to use and inexpensive video storage. You just need to know how to tell your TCP/IP stack to not delay ACKs...

Friday Mar 09, 2007

CSI:Munich - How to save the world with ZFS and 12 USB Sticks

Here's a fun video that shows how cool the Sun Fire X4500 (codename: Thumper) is and how you can create your own Thumper experience on your laptop using Solaris ZFS and 12 USB sticks:

This is finally the English dubbed version of a German video that a couple of colleagues and I produced some weeks ago. If you don't mind the German language, you might enjoy the original German version, too (it turns out that the English language has a lot less redundancy than the German one, so please forgive the occasional soundless lip motions).

If you liked the video(s), let us know, we'll be glad to answer any questions, receive any leftover Oscars or accept any new ideas for future episodes.

Here are a few more details, in case you really want to try this at home:

The first hurdle to overcome is to teach Solaris how to accept more than 10 USB storage devices. On a plain vanilla Solaris 10 system, it turns out that there is a limitation: Connecting more than 10 USB sticks through 3 USB-powered hubs yields a "Connecting device on port n failed" error. Thanks to a colleague from engineering, the fix is to set ehci:ehci_qh_pool_size = 120 in /etc/system.

The second issue is briefly explained in the video itself: Not all USB sticks (particularly the cheap ones) are created equal. Small variations in the components create small variations in their storage space. So, when creating a zpool, you need to use -f to tell zpool to ignore differing device sizes.

If you pay close attention to the video, you'll notice around 7:20 that pulling a hub wasn't so harmless after all: "errors: 8 data errors, use '-v' for a list" can be seen at the bottom of the terminal window. In fact, zpool status reports 6 checksum errors in c21t0d0p0. Well, using cheap USB sticks means that block errors can occur in practice and once you don't have enough redundancy (like after unplugging a USB hub for show effect) they may hurt you. Fortunately, they didn't hurt our particular demo, since on one hand ZFS' prefetch algorithm had most of the video in memory anyway, while on the other hand zpool scrub fixed any broken blocks after re-plugging the USB hub. So, the cheaper the storage the more redundancy one should add. In this case, RAID-Z2 would have been better. Perhaps we can get some more USB sticks and hubs from any sponsors?
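To get a feeling for what the extra redundancy costs, here is a rough capacity estimate in Python. It is only a sketch: it assumes that every member of a RAID-Z group counts as the smallest stick in the group and that parity takes away one (RAID-Z) or two (RAID-Z2) sticks' worth of space, and it ignores metadata and padding overhead. The stick sizes are made up.

```python
def raidz_usable(device_sizes_mb, parity):
    """Approximate usable capacity of one RAID-Z group, in MB."""
    if len(device_sizes_mb) <= parity:
        raise ValueError("need more devices than parity disks")
    smallest = min(device_sizes_mb)  # slightly larger sticks give up their extra space
    return (len(device_sizes_mb) - parity) * smallest

# Hypothetical "1 GB" sticks with slightly differing sizes.
sticks = [1000, 996, 1004, 998]
print("RAID-Z :", raidz_usable(sticks, parity=1), "MB")
print("RAID-Z2:", raidz_usable(sticks, parity=2), "MB")
```

With four sticks per group, RAID-Z2 gives up a third of the usable space compared to RAID-Z, in exchange for surviving a second failed (or unplugged) stick.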

Finally, it took us a couple of retries until the remove-sticks-mix-then-replug stunt worked, because it turned out that the laptop's USB implementation wasn't as reliable as we needed it to be. And yes, it does help to wait until they've finished blinking before removing any sticks :).

All in all, it was great fun for us producing this video and thanks to the tireless efforts of Marc, our beloved but invisible video editor, we now can proudly present an English version. Actually, we were quite surprised by this video's success: We published it in early February and just a day later, it got noticed by a couple of Solaris engineering people. Now, we have more than 9000 views of the German version (counting the Google video and the YouTube edition together) and are still counting. Hopefully, we can cross the 10,000 views barrier with the English version, now that we have increased the potential audience :).

After watching the video, feel free to try out Solaris ZFS for yourself. There's nothing like building your own pool, then watching ZFS take care of your data. At home, ZFS keeps my photos, music and TV videos nice and tidy, including weekly snapshots thanks to Tim Foster's automatic snapshot SMF service. Just this Tuesday, my weekly zpool scrub cron job told me it had fixed a broken block on one of my disks. One that I'd never have found with any other storage system.

To get started, get OpenSolaris here or download it here. All you need to do is check out the docs, though real system heroes only need two man-pages: zpool(1M) and zfs(1M).

P.S.: CSI of course stands for "Computer Systems Integration". Any similarities to the popular TV show are purely coincidence. Really. Hmm, but maybe having a dead body or two in one of the next episodes might spice up things a little...

P.P.S.: The cool rock music at the beginning is from XING, a great rock band in which one of our colleagues plays drums. Go XING!

Update: Here is a much higher quality version, in case you want to show this video around on your laptop.



