Thursday Feb 14, 2008

Be a System Hero

Ansaphone mockery ad 

If you read this blog regularly, you might have noticed that I like spending time participating in podcasts for the German website (for instance, see here, here and of course here). The podcast and the community are in German, so if your native tongue isn't, the times of envy are over. Welcome to!

What is it?

It's a community website for those that are the "up" in "uptime", the unsung heroes of data centers, the people that never get a "Thank you for delivering all of my 1526 emails today!" call: The system heroes. If you like tinkering with computer systems, it's probably something for you.

What's in it for me?

First of all: A lot of fun, including some comics. A place to plug your blog (and who doesn't want the occasional extra spike in hitrates...). A place to meet other system heroes and chat about those pesky little lusers and their latest PEBKAC incidents while exchanging LART maintenance tips. And they have the coolest system hero game around: Caffeine Crazy. As seen, er, heard on HELDENFunk #9 and #10. Try it out!

Yeah, there's some Sun marketing, too, I admit. Mainly references to cool technology from Sun and the ability to test it 60 days for free (if it's hardware) or just use it eternally for free (if it's software), but someone has to pay the hosting bills and I assure you: It's for the good of system herokind.

Oh, and you gotta love these great ads at the bottom of each page (my favourite is above).

Cool, what do I do?

Do as Yoda would say: "Hrrm, a system hero you want to be? Sign up you need!" Well, being a system hero has never been so much fun...

Thursday Dec 06, 2007

X4500 + Solaris ZFS + iSCSI = Perfect Video Editing Storage

Digital video editing is one of those applications that tend to be very data hungry. At SD PAL resolution, we're talking about 720 pixels x 576 lines x 3 bytes of color x 25 full frames per second = about 30 MB/s of data. That's about 224 GB for a 2 hour feature film. Not counting audio (that would only be around 3-4 GB). And we (in Germany) haven't looked at HD or Digital Cinema a lot yet...
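For anyone who wants to redo the math, here is a small sketch of the calculation above. It assumes uncompressed video and decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes); the variable names are only for illustration.

```python
# Uncompressed SD PAL video: data rate and size of a 2-hour film.
WIDTH = 720            # pixels per line
LINES = 576            # lines per frame
BYTES_PER_PIXEL = 3    # 3 bytes of color
FPS = 25               # full frames per second

bytes_per_frame = WIDTH * LINES * BYTES_PER_PIXEL
bytes_per_second = bytes_per_frame * FPS

film_seconds = 2 * 60 * 60
film_bytes = bytes_per_second * film_seconds

print(f"{bytes_per_second / 1e6:.1f} MB/s")      # decimal megabytes per second
print(f"{film_bytes / 1e9:.0f} GB for 2 hours")  # decimal gigabytes
```

In binary units (MiB, GiB) the numbers come out a bit smaller, which is why "about 30 MB/s" is a fair rounding either way.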

During the last couple of weeks I worked with a customer who bought a Sun Fire X4500 server (you know, Thumper). The plan is to run Solaris ZFS on it, then provide big iSCSI volumes to the video editing systems, which tend to be specialized Windows or Mac OS X machines. Wonderful idea: Just use zpool create to combine a number of disks with some RAID level into a pool, then zfs create -V to create a ZVOL. Thanks to zfs set shareiscsi=on, sharing the volume over iSCSI is dead easy.

But it didn't work.

First, Windows wouldn't mount the iSCSI volume. After some trying, we discovered that there must be an upper limit of 2 TB to the size of iSCSI volumes that Windows can mount (we initially tried something like 5 or 10 TB). So be it: zfs create -V 2047G videopool/videovolume.
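A plausible explanation for the 2 TB ceiling (an assumption, not something confirmed here) is 32-bit sector addressing with 512-byte sectors. The sketch below only shows that such a scheme tops out at exactly 2048 GiB, which is why a 2047G volume squeezes in underneath. The constants are assumptions, and ZFS's "G" suffix is treated as GiB.

```python
# Assumed cause of the limit: 32-bit sector numbers and 512-byte sectors.
SECTOR_BYTES = 512        # assumption: classic 512-byte sectors
MAX_SECTORS = 2 ** 32     # assumption: 32-bit sector addressing
GIB = 2 ** 30

limit_bytes = SECTOR_BYTES * MAX_SECTORS
limit_gib = limit_bytes // GIB

volume_gib = 2047         # the size used in "zfs create -V 2047G"

print(f"addressable limit: {limit_gib} GiB")
print(f"2047G fits below the limit: {volume_gib < limit_gib}")
```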

Now it mounted ok, we formatted the disk with NTFS (yuck!) and started the editing system's speed test. Then came the real issue: The test reported a write performance of 8-10 MB/s, but the editing system needs something like 30 MB/s sustained to be able to record reliably!

After some trying, we started the systematic approach:

  • A simple dd from one disk to another yielded >39 MB/s.
  • dd'ing from one small ZFS pool to another exceeded 120 MB/s (I later learned that cp is a better benchmark because it works asynchronously with large chunks of data vs. dd's synchronous block approach), so that was again more than we needed.
  • We tried re-attaching our ZVOL through iscsiadm to test the iSCSI stack's performance and ran into a TCP fusion issue. Ok, I've always wanted to play with mdb, so we followed the workaround instructions and we were able to attach our own ZVOL over the loopback interface. Slightly less performance (due to up the stack, down the stack effects, I presume) but still way more than we needed. So, it was neither the X4500's nor ZFS's fault.
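If you want to repeat this kind of measurement without dd or a vendor speed test, a rough sequential write test is easy to sketch. This is not the tool we used, just a minimal illustration: the function name and default sizes are made up, "MB" here means 2^20 bytes, and a single short run like this is only a ballpark figure.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, total_mb=64, chunk_mb=1):
    """Write total_mb of data into a temp file in `directory`; return MB/s."""
    # Random data, so a compressing filesystem cannot shrink the writes.
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(total_mb // chunk_mb):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # include the time to reach stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return total_mb / elapsed


if __name__ == "__main__":
    rate = write_throughput_mb_s(tempfile.gettempdir())
    print(f"sequential write: {rate:.0f} MB/s")
```

Point the directory argument at the mounted volume you want to test and compare the result against the 30 MB/s the editing system needs.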

Finally, Danilo pointed me in the right direction: Nagle's algorithm. What usually helps maximize network bandwidth turns out to be a killer for iSCSI performance. For Solaris iSCSI clients, we know this already, but how do we turn off Nagle on Windows?

The answer is deeply buried inside Microsoft's iSCSI Initiator user guide: The "Addressing Slow Performance with iSCSI Clusters" chapter mentions a similar issue (although they talk about read, not write, performance) and they do mention RFC 1122's delayed ACK feature, which is related to Nagle's algorithm. The Microsoft document suggests a workaround which involves setting a variable in the registry, so it was worth a try (and my vengeance for having to use mdb before).

And lo and behold, the speed test now yielded 90-100 MB/s (close to a GBE's raw performance)! Yippee, that was it! One little registry entry on the client side gave us a 10x improvement in iSCSI performance!

Now, can someone explain to me, why on Windows 2000 you need to set "TcpAckDelTicks=0" while on Windows 2003 the same thing is accomplished by saying "TcpAckFrequency=1" (which is the same thing, only seen from the other side of the division sign)?

So, to all you storage hungry video editors out there: The Sun Fire X4500 with Solaris ZFS and iSCSI is a great solution for reliable, fast, easy to use and inexpensive video storage. You just need to know how to tell your TCP/IP stack to not delay ACKs...
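A side note for those who write their own network code: at the application level, Nagle's algorithm is normally switched off per socket with the standard TCP_NODELAY option. The sketch below shows only that per-socket knob. It is not the registry change described above (which concerns delayed ACKs on the Windows side), and the helper name is made up for illustration.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A non-zero value means small writes are sent immediately instead of
    # being held back and coalesced into larger segments.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
print(f"TCP_NODELAY enabled: {enabled}")
sock.close()
```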

Wednesday Dec 05, 2007

OMG: "Hostile" Takeover of

A few hours ago, has been "taken over" by

"Systemheld" in German translates to "system hero" and that's what this community portal is all about. Visitors of the German Sun home page are now being asked to "honor their sysadmin", because "without his unreached knowledge, his daily commitment to his job, his angel-like patience and a mind-expanding amount of coffee consumption, things would go dark pretty soon." (s/his/her/g where appropriate).

Having been a system administrator at my university's computer center in the mid-nineties, I know what this means. I administered our university's proxy server at the beginning of the dot-com boom, and I've had my share of typical sysadmin-vs-luser stories :).

Speaking of which, check out the new series of comics that were produced for Even if you don't speak German, you'll understand what they mean...

Tomorrow we'll be recording a new episode of the HELDENFunk podcast and we have a couple of cool things lined up, so stay tuned.

Monday Oct 08, 2007

CEC 2007 in Las Vegas: Podcasting, JavaFX Hacking and HPC Software

Since I arrived in Las Vegas on Saturday, October 6th, I've been busy with a number of things that are going on at the Sun CEC 2007 Conference:

  • CEC 2007 Messaging: One of the cool things during the general sessions is the ability for attendees to send in their questions and comments via Email, SMS or Instant Messaging in real time, while the speaker is presenting. Backstage, these messages are fed into a database. Then, two aggregate feeds are created: One goes to the CEC Message Visualizer, a Java application written by Simon Cook which visualizes the flow of information in a very nice way so the audience can see where their messages are going. The other feed goes mainly to the presenters on stage so they know what the current questions are and can answer them. That feed gets visualized through a JavaFX Script application that I've been busy writing over the last weeks.
  • Podcasting: Tune in to the new CEC 2007 Podcast that is going live at this very moment. In the first episode, Hartmut Streppel, Eric Bezille, Matthias Pfützner and I sit together at the Gordon Biersch in Las Vegas (Prost!) while we discuss our plans and projects for CEC 2007, including Service Virtualization and Consolidation, ZFS, Flying Zones, the Message Aggregation Process and other cool stuff. Send me email or call my mobile phone if you want to participate in one of our next episodes!
  • HPC Software: In about an hour, Roland Rambau, Barton Fiske and I will present on HPC Software: Roland will cover the general state of HPC Software at Sun and talk about HPC storage solutions around CFS' Lustre filesystem, Barton will present the Sun Visualization Software solutions and I'll cover the Sun Grid Engine and some information on Sun Studio Developer Tools.

So, have fun listening to the podcast and see you at the HPC Software session if you happen to be in Vegas!

Tuesday Jul 31, 2007

New Year's Resolutions

Yesterday, we announced good financial results for the last fiscal year 07. Very good financial results. I like working for a profitable company: it makes so many things so much easier.

Tomorrow, I'm going to have a meeting with my managers to discuss what to do next. Since we're early in the new fiscal year 08, this is a good time to think about it. So, here are some new year's priorities for my FY08 at Sun:

  • Web 2.0: I've been talking to customers, partners and Sun people in Germany about Web 2.0 a number of times. Every time, the feedback has been very clear: We want More! So I'm going to do more Web 2.0 related stuff: More blogging, podcasting, perhaps a successor to the now famous ZFS movie, more participation in social networking sites, including, XING and Facebook, more evangelizing and of course more insight into where this journey is headed.
  • Technology: Sun is all about technology. We create, apply and leverage technology to enable the participation age. (Did you know that we've proclaimed the participation age before Tim O'Reilly published his famous Web 2.0 article?)
    We've seen Niagara changing the rules of processor technology and building the backbone of the web, again, and we've already disclosed some information on Niagara 2. We've seen the Constellation System debut during ISC 2007. You may have noticed that the Sun Ultra 40 Workstation is the best workstation on the planet, and BTW, we're changing the economics of true Video-On-Demand Streaming as well, just to name a few favourite technologies on my list.
    The biggest problem to solve now is: Spreading the word. Let me explain. Whenever I participate in a Sun day (a customer meeting in which Sun people present on new Sun technologies), two effects consistently happen: First, more people than originally planned show up (I once had people join in over a video conference line). Second, the meeting takes much longer than originally anticipated, because customers want to hear so much more about our technologies.
    Since we don't have much money to spend on advertising, sponsoring or other forms of traditional awareness generation, we need to do a lot more of these Sun days, and talk to customers one by one. Is this more difficult and time-consuming? Yes. Does this have a more lasting effect than traditional marketing? You bet. Only by talking to the experts at our customers are we able to verify that what we do is right and make sure our technology meets the people that want/need/develop for/join/use/participate in it. In FY08, I'm going to participate in more Sun days and talk to as many customers about Sun technology as I can.
  • Solaris: This may be a sub-topic of "Technology", but it really is a topic of its own: I use Solaris at home and on my laptop, I evangelize it to customers, and it feeds my need as a computer scientist to learn about interesting things every day. In FY08, I'm going to use more new Solaris features at home and at work, write more about it (German readers: Check out this ZFS whitepaper), participate more in the OpenSolaris communities and make sure OpenSolaris gets the attention it deserves from developers, customers and partners.

All in all, I'm sure FY08 is going to be interesting and fun. FY07 has been the year of technology announcements, FY08 will be the year of seeing them all in action. A year of interesting times.

Wednesday Jul 25, 2007

Now That's What I Call Rock-Solid!

A rock-solid Sun server still functioning flawlessly.

Check out this story from A system admin enters their datacenter, only to find this scene of a crushed floor and a fallen rack full of Sun equipment. This must have happened some time ago, only the sysadmin didn't notice it because all of the servers were still running as if nothing had happened! Later, Sun services checked every system in the rack and the only fault they found was a simple hard disk failure.

Sun systems have a reputation for being rock-solid, no doubt... 

P.S.: "Systemheld" translates to "system hero". is a community for the unsung system admins among us, in constant danger of being disbudgeted by moronic beancounters and haunted by incompetent lusers. Sometimes, their only defense is a LART-Whip.


Tune in and find out useful stuff about Sun Solaris, CPU and System Technology, Web 2.0 - and have a little fun, too!

