Saturday May 05, 2007

The inculcation of systems thinking

As is known but perhaps not widely reported, all three of us on Team DTrace are products of Brown University Computer Science. More specifically, we were all students in (and later TAs for) Brown's operating systems course, CS169. This course has been taught by the same professor, Tom Doeppner, over its thirty-year lifetime, and has become something of a legend in Silicon Valley, having produced some of the top engineers at major companies like NetApp, SGI, Adobe, and VMware -- not to mention tons of smaller companies. And at Sun, CS169 has cast a particularly long shadow, with seven CS169 alums (Adam, Dan, Dave, Eric, Matt, Mike and me) having together played major roles in developing many of the revolutionary technologies in Solaris 10 (specifically, DTrace, ZFS, SMF, FMA and Zones).

I mention the Brown connection because this past Thursday, Brown hosted a symposium to honor both the DTrace team in particular and the contributions of former CS169 undergraduate TAs more generally. We were each invited to give a presentation on a topic of our choosing, and seizing the opportunity for intellectual indulgence, I chose to reflect on a broad topic: the inculcation of systems thinking. My thoughts on this topic deserve their own lengthy blog entry, but this presentation will have to suffice for now -- albeit stripped of the references to the Tupolev Tu-144, LBJ, Ray Kurzweil, the 737 rudder reversal and Ruby stack backtraces that peppered (or perhaps polluted?) the actual talk...

Monday Mar 19, 2007

Ian Murdock joins Sun!

I know that I've been quiet for a while, and I promise that I (or rather, we) are close to talking about what we've been up to for the past year, but I wanted to first pop my head up to highlight some exciting news: Ian Murdock has joined Sun.

I have always been impressed with Ian's decidedly pragmatic views of technology; he has been a supporter of OpenSolaris, and in particular he took a decisive and courageous stand against some absurd Debian-borne anti-Nexenta licensing FUD. Looking back, it's hard to imagine that our OpenSolaris/Debian fantasia was not even two years ago, and that it was just eighteen months ago that we first saw a prototype of what that Utopia might look like. Ian's arrival at Sun is a huge lurch forward towards the widespread productization of this ideal: it's great news for OpenSolaris and it's great news for long-time fans of Debian -- it looks like we're going to be able to apt-get our cake, and DTrace it too!

Thursday Oct 12, 2006

DTrace on Rails, reprise

A little while ago, I blogged about DTrace on Rails. In particular, I promised that I would get diffs based on is-enabled probes out "shortly." In giving a guest lecture for a class at Berkeley yesterday, I was reminded that I still hadn't made this available. With my apologies for the many-months delay, the diff (against Ruby 1.8.2) is here.

And as long as I have your eyeballs, let me join Adam in directing you to Brendan Gregg's amazing Helper Monkey. Brendan's work is an indisputable quantum leap for what has become one of the most important, one of the most misunderstood, and certainly one of the most undebuggable platforms on the planet: JavaScript. If you do anything AJAX-ian that's even vaguely performance or footprint sensitive, you're wasting your time if you're not using Helper Monkey. (When Brendan first demo'd Helper Monkey to Adam and me, Adam's line was that Brendan had just propelled JavaScript debugging forward by "100,000 years" -- which is not to say that Helper Monkey is like debugging in the 1021st century, but rather that debugging JavaScript without Helper Monkey is like debugging in the late Pleistocene.)

Thursday Aug 17, 2006

DTrace on AIX?

So IBM has been on the warpath recently against OpenSolaris, culminating with their accusation yesterday that OpenSolaris is a "facade." This is so obviously untrue that it's not even worth refuting in detail. In fact, being the father of a toddler, I would liken IBM's latest outburst to something of a temper tantrum -- and as with a screaming toddler, the best way to deal with this is to not reward the performance, but rather to offer some constructive alternatives. So, without further ado, here are my constructive suggestions to IBM:

  • Open source OS/2. Okay, given the tumultuous history of OS/2, this is almost certainly not possible from a legal perspective -- but it would be a great open source contribution (some reasonably interesting technology went down with that particular ship), and many hobbyists would love you for it. Like I said, it's probably not possible -- but just to throw it out there.

  • Open source AIX. AIX is one of the true enterprise-class operating systems -- one with a long history of running business-critical applications. As such, it would be both a great contribution to open source and a huge win for AIX customers for AIX to go open source -- if only to be able to look at the source code when chasing a problem that isn't necessarily a bug in the operating system. (And I confess that on a personal level, I'm very curious to browse the source code of an operating system that was ported from PL/1.) However, as with OS/2, AIX's history is likely to make open sourcing it tough from a legal perspective: its Unix license dates from the Bad Old Days, and it would probably be time-consuming (and expensive) to unencumber the system to allow it to be open sourced.

Okay, those two are admittedly pretty tough for legal reasons. Here are some easier ones:

  • Support the port of OpenSolaris to POWER/PowerPC. Sun doesn't sell POWER-based gear, so you would have the comfort of knowing that your efforts would in no way assist a Sun hardware sale, and your POWER customers would undoubtedly be excited to have another choice for their deployments.

  • Support the nascent effort to port OpenSolaris to the S/390. Hey, if Linux makes sense on an S/390, surely OpenSolaris with all of its goodness makes sense too, right? Again, customers love choice -- and even an S/390 customer that has no intention of running OpenSolaris will love having the choice made available to them.

Okay, so those two are easier because the obstacles aren't legal obstacles, but there are undoubtedly internal IBM cultural issues that make them effectively non-starters.

So here's my final suggestion, and it's an absolutely serious one. It's also relatively easy, it clearly and immediately benefits IBM and IBM's customers -- and it doesn't even involve giving up any IP:

  • Port DTrace to AIX. Your customers want it. Apple has shown that it can be done. We'll help you do it. And you'll get to participate in the DTrace community (and therefore the OpenSolaris community) in a way that doesn't leave you feeling like you've been scalped by Scooter. Hell, you can even follow Apple's lead with Xray and innovate on top of DTrace: from talking to your customers over the years, it's clear that they love SMIT -- integrate a SMIT frontend with a DTrace backend! Your customers will love you for it, and the DTrace community will be excited to have yet another system on which they can use DTrace.

Now, IBM may respond to these alternatives just as a toddler sometimes responds to constructive alternatives ("No! No! NO! Mine! MINE! MIIIIIIIINE!", etc). But if cooler heads prevail at Big Blue, these suggestions -- especially the last one -- will be seen as a way to constructively engage that will have clear benefits for IBM's customers (and therefore for IBM). So to IBM I say what parents have said to screaming toddlers since time immemorial: we're ready when you are.

Monday Aug 07, 2006

DTrace on Mac OS X!

From WWDC here in San Francisco: Apple has just announced support for DTrace in Leopard, the upcoming release of Mac OS X! Often (or even usually?) announcements at conferences are more vapor than ware. In this case, though, Apple is being quite modest: they have done a tremendous amount of engineering work to bring DTrace to their platform (including, it turns out, implementing DTrace's FBT provider for PowerPC!), and they are using DTrace as part of the foundation for their new Xray performance tool. This is very exciting news, as it brings DTrace to a whole slew of new users. (And speaking personally, it will be a relief to finally have DTrace on the only computer under my roof that doesn't run Solaris!) Having laid hands on DTrace on Mac OS X myself just a few hours ago, I can tell you that while it's not yet a complete port, it's certainly enough to be uniquely useful -- and it was quite a thrill to see Objective-C frames in a ustack action! So kudos to the Apple engineering team working on the port: Steve Peters, James McIlree, Terry Lambert, Tom Duffy and Sean Callanan.

It's been fun for us to work with the Apple team, and gratifying to see their results. And it isn't just rewarding for us; the entire OpenSolaris community should feel proud about this development because it gives the lie to IBM's nauseating assertion that we in the OpenSolaris community aren't in the "spirit" of open source.

So to Apple users: welcome to DTrace! (And to DTrace users: welcome to Mac OS X!) Be sure to join us in the DTrace community -- and if you're at the WWDC, we on Team DTrace will be on hand on Friday at the DTrace session, so swing by to see a demo of DTrace running on Mac OS X and to meet both the team at Apple that worked on the port and those of us at Sun who developed DTrace. And with a little luck, you might even hear Adam commemorate the occasion by singing his beautiful and enchanting ISA aria...

Update: For disbelievers, Adam posted photos -- and Mike went into greater detail on the state of the Leopard DTrace port, and what it might mean for the direction of DTrace.

Thursday May 25, 2006

DTrace on FreeBSD, update

A while ago, I blogged about the possibility of a FreeBSD port of DTrace. For the past few months, John Birrell has been hard at work on the port, and has announced recently that he has much of the core DTrace functionality working. Over here at DTrace Central, we've been excitedly watching John's progress for a little while, providing help and guidance where we can -- albeit not always solicited ;) -- and have been very impressed with how far he's come. And while John has quite a bit further to go before one could call it a complete port, what he has now is indisputably useful. If you run FreeBSD in production, you're going to want John's port as it stands today -- and if you develop for the FreeBSD kernel (drivers or otherwise), you're going to need it. (Once you've done kernel development with DTrace, there's no going back.)

So this is obviously a win for FreeBSD users, who can now benefit from the leap in software observability that DTrace provides. It's also clearly a win for DTrace users, because now you have another platform on which you can observe your production software -- and a larger community with whom to share your experiences and thoughts. And finally, it's a huge win for OpenSolaris users: the presence of a FreeBSD port of DTrace validates that OpenSolaris is an open, innovative platform (despite what some buttheads say) -- one that will benefit from and contribute to the undeniable economics of open source.

So congrats to John! And to the FreeBSD folks: welcome to the DTrace community!

Monday May 01, 2006

DTrace on Rails

First I need to apologize for having been absent for so long -- I am very much heads-down on a new project. (The details of which will need to wait for another day, I'm afraid -- but suffice it to say that it, like just about everything else I've done at Sun, leverages much of what I've done before it.)

That said, I wanted to briefly emerge to discuss some recent work. A while ago, I blogged about DTrace and Ruby, using Rich Lowe's prototype DTrace provider. This provider represents a quantum leap for Ruby observability, but it suffers from the fact that we must do work (in particular, we must get the class and method) even when disabled. This is undesirable (especially considering that the effect can be quite significant -- up to 2X), and it runs against the DTrace ethos of zero disabled probe effect, but there has been no better solution. Now, however, thanks to Adam's work on is-enabled probes, we can have a Ruby provider that has zero disabled probe effect. (Or essentially zero: I actually measured the probe effect at 0.2% -- very much in the noise.) Having zero disabled probe effect allows us to deploy DTrace on Ruby in production -- which in turn opens up a whole new domain for DTrace: Ruby on Rails. And as I was reminded by Jason Hoffman's recent Scale with Rails presentation (in which he outlines why they picked Solaris generally -- and ZFS in particular), this is a hugely important growth area for Solaris. So without further ado, here is a (reasonably) simple script that relies on some details of WEBrick and Rails to yield a system profile for Rails requests:

#pragma D option quiet

self string uri;

/*
 * NOTE: the probe descriptions in this script are assumptions, inferred
 * from the arguments that each clause uses.  $1 is the pid of the ruby
 * process.
 */

/* Remember the arguments of each read(2) issued by ruby. */
syscall::read:entry
/execname == "ruby" && self->uri == NULL/
{
        self->fd = arg0;
        self->buf = arg1;
        self->size = arg2;
}

/* A buffer that starts with "GET " marks the start of a request. */
syscall::read:return
/self->uri == NULL && self->buf != NULL && strstr(this->str =
    copyinstr(self->buf, self->size), "GET ") == this->str/
{
        this->head = strtok(this->str, " ");
        self->uri = this->head != NULL ? strtok(NULL, " ") : NULL;
        self->syscalls = 0;
        self->rbcalls = 0;
}

syscall::read:return
/self->buf != NULL/
{
        self->buf = NULL;
}

syscall:::entry
/self->uri != NULL/
{
        @syscalls[probefunc] = count();
}

ruby$1:::function-entry
/self->uri != NULL/
{
        @rbclasses[this->class = copyinstr(arg0)] = count();
        this->sep = strjoin(this->class, "#");
        @rbmethods[strjoin(this->sep, copyinstr(arg1))] = count();
}

pid$1::mysql_send_query:entry
/self->uri != NULL/
{
        @queries[copyinstr(arg1)] = count();
}

/* A write of "HTTP/1.1..." on the request's descriptor ends the request. */
syscall::write:entry
/self->uri != NULL && arg0 == self->fd && strstr(this->str =
    copyinstr(arg1, arg2), "HTTP/1.1") == this->str/
{
        self->uri = NULL;
        ncalls++;       /* assumed: one serviced request per response */
}

END
{
        normalize(@syscalls, ncalls);
        trunc(@syscalls, 10);
        printf("Top ten system calls per URI serviced:\n");
        printa("  %-68s | %@d\n", @syscalls);

        normalize(@rbclasses, ncalls);
        trunc(@rbclasses, 10);
        printf("\nTop ten Ruby classes called per URI serviced:\n");
        printa("  %-68s | %@d\n", @rbclasses);

        normalize(@rbmethods, ncalls);
        trunc(@rbmethods, 10);
        printf("\nTop ten Ruby methods called per URI serviced:\n");
        printa("  %-68s | %@d\n", @rbmethods);

        trunc(@queries, 10);
        printf("\nTop ten MySQL queries:\n");
        printa("  %-68s | %@d\n", @queries);
}

Running the above while horsing around with the Depot application from Agile Web Development with Rails yields the following:
Top ten system calls per URI serviced:
  setcontext                                                           | 15
  fcntl                                                                | 16
  fstat64                                                              | 16
  open64                                                               | 21
  close                                                                | 25
  llseek                                                               | 27
  lwp_sigmask                                                          | 30
  read                                                                 | 62
  pollsys                                                              | 80
  stat64                                                               | 340

Top ten Ruby classes called per URI serviced:
  ActionController::CodeGeneration::Source                             | 89
  ActionController::CodeGeneration::CodeGenerator                      | 167
  Fixnum                                                               | 190
  Symbol                                                               | 456
  Class                                                                | 556
  Hash                                                                 | 1000
  String                                                               | 1322
  Array                                                                | 1903
  Object                                                               | 2364
  Module                                                               | 6525

Top ten Ruby methods called per URI serviced:
  Object#dup                                                           | 235
  String#==                                                            | 250
  Object#is_a?                                                         | 288
  Object#nil?                                                          | 316
  Hash#[]                                                              | 351
  Symbol#to_s                                                          | 368
  Object#send                                                          | 593
  Module#included_modules                                              | 1043
  Array#include?                                                       | 1127
  Module#==                                                            | 5058

Top ten MySQL queries:
  SELECT * FROM products  LIMIT 0, 10                                  | 2
  SELECT * FROM products WHERE ( = '7')  LIMIT 1                       | 2
  SELECT count(*) AS count_all FROM products                           | 2
  SHOW FIELDS FROM products                                            | 5
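
A note on reading these numbers: the "per URI serviced" figures are the raw aggregation counts divided by the number of requests completed, which is what the normalize() calls in the script are for, and trunc() then keeps only the ten largest keys. Here is a minimal sketch of that arithmetic, in Python rather than D so that it can be run anywhere. The raw totals and the request count of five are made up for illustration (they are not measurements from this run), and the integer division is my assumption about how the normalized counts are displayed.

```python
def normalize(counts, factor):
    """Divide every aggregated count by factor, keeping integers."""
    if factor <= 0:
        raise ValueError("normalization factor must be positive")
    return {key: value // factor for key, value in counts.items()}


def top(counts, n):
    """Keep the n largest entries, sorted ascending as in the output above."""
    return sorted(counts.items(), key=lambda kv: kv[1])[-n:]


# Hypothetical raw totals accumulated over five serviced requests.
raw = {"stat64": 1700, "pollsys": 400, "read": 310}
requests = 5

per_request = normalize(raw, requests)
for name, value in top(per_request, 2):
    # Same layout as the printa() format string "  %-68s | %@d".
    print(f"  {name:<68} | {value}")
```

With these made-up inputs, 1,700 stat64 calls over five requests comes out to 340 per request, and only the two largest keys (pollsys and stat64) are printed.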

While this gives us lots of questions we might want to answer (e.g., "why the hell are we doing 340 stats on every 'effing request?!"1), it might be a little easier to look at a view that lets us see requests and the database queries that they induce. Here, for example, is a similar script to do just that:
#pragma D option quiet

self string uri;
self string newuri;

/*
 * NOTE: the probe descriptions in this script are assumptions, inferred
 * from the arguments that each clause uses.  $1 is the pid of the ruby
 * process.
 */

BEGIN
{
        start = timestamp;
}

syscall::read:entry
/execname == "ruby" && self->uri == NULL/
{
        self->fd = arg0;
        self->buf = arg1;
        self->size = arg2;
}

syscall::read:return
/self->uri == NULL && self->buf != NULL && (strstr(this->str =
    copyinstr(self->buf, self->size), "GET ") == this->str ||
    strstr(this->str, "POST ") == this->str)/
{
        this->head = strtok(this->str, " ");
        self->newuri = this->head != NULL ? strtok(NULL, " ") : NULL;
}

syscall::read:return
/self->newuri != NULL/
{
        self->uri = self->newuri;
        self->newuri = NULL;
        printf("%3d.%03d => %s\n", (timestamp - start) / 1000000000,
            ((timestamp - start) / 1000000) % 1000,
            self->uri);
}

pid$1::mysql_send_query:entry
/self->uri != NULL/
{
        printf("%3d.%03d   -> \"%s\"\n", (timestamp - start) / 1000000000,
            ((timestamp - start) / 1000000) % 1000,
            copyinstr(self->query = arg1));
}

pid$1::mysql_send_query:return
/self->query != NULL/
{
        printf("%3d.%03d   <- \"%s\"\n", (timestamp - start) / 1000000000,
            ((timestamp - start) / 1000000) % 1000,
            copyinstr(self->query));
        self->query = NULL;
}

syscall::read:return
/self->buf != NULL/
{
        self->buf = NULL;
}

syscall::write:entry
/self->uri != NULL && arg0 == self->fd && strstr(this->str =
    copyinstr(arg1, arg2), "HTTP/1.1") == this->str/
{
        printf("%3d.%03d <= %s\n", (timestamp - start) / 1000000000,
            ((timestamp - start) / 1000000) % 1000,
            self->uri);
        self->uri = NULL;
}

Running the above while clicking around with the Depot app:
# ./rsnoop.d `pgrep ruby`
  7.936 => /admin/edit/7
  7.955   -> "SELECT * FROM products WHERE ( = '7')  LIMIT 1"
  7.956   <- "SELECT * FROM products WHERE ( = '7')  LIMIT 1"
  7.957   -> "SHOW FIELDS FROM products"
  7.957   <- "SHOW FIELDS FROM products"
  7.971 <= /admin/edit/7
 20.881 => /admin/update/7
 20.952   -> "SELECT * FROM products WHERE ( = '7')  LIMIT 1"
 20.953   <- "SELECT * FROM products WHERE ( = '7')  LIMIT 1"
 20.953   -> "SHOW FIELDS FROM products"
 20.953   <- "SHOW FIELDS FROM products"
 20.954   -> "BEGIN"
 20.954   <- "BEGIN"
 20.955   -> "UPDATE products SET `title` = 'foo bar', `price` = 1.2, ...
 20.955   <- "UPDATE products SET `title` = 'foo bar', `price` = 1.2, ...
 20.989   -> "COMMIT"
 20.989   <- "COMMIT"
 21.001 <= /admin/update/7
 21.005 => /admin/show/7
 21.023   -> "SELECT * FROM products WHERE ( = '7')  LIMIT 1"
 21.023   <- "SELECT * FROM products WHERE ( = '7')  LIMIT 1"
 21.024   -> "SHOW FIELDS FROM products"
 21.024   <- "SHOW FIELDS FROM products"
 21.038 <= /admin/show/7
I'm no Rails developer, but it seems like this might be useful... If you want to check this out for yourself, start by getting Rich's prototype provider. (Using it, you can do everything I've done here, just with higher disabled probe effect.) Meanwhile, I'll work with Rich to get the lower disabled probe effect version out shortly. Happy Railing!
1 Or if you're as savvy as the commenters on this blog entry, you might be saying to yourself, "why the hell is the 'effing development version running in production?!"

Tuesday Dec 13, 2005

DTrace for Linux

There are two ways to get DTrace for another operating system: you can try porting DTrace to the other system, or you can -- as Adam Leventhal describes -- use the new BrandZ framework to get that other system running under Solaris. Adam describes applying DTrace to a Linux binary -- top -- and using DTrace to find a (pretty serious) Linux-specific performance problem. Pretty eff'in cool...

Wednesday Nov 16, 2005

Welcome to ZFS!

If you haven't already seen it, ZFS is now available for download, marking a major milestone in the history of filesystems. Today is a way station down a long road: for as long as I have known Jeff Bonwick, he has wanted to solve the filesystem problem -- and about five years ago, Jeff set out to do just that, starting (as Jeff is wont to do) from a blank sheet of paper. I vividly remember Jeff describing some of his nascent ideas on my whiteboard; the ideas were radical and revolutionary, their implications manifold. I remember thinking "he's either missed something basic that somehow invalidates these ideas -- or this is the most important development in storage since RAID." As I recently recounted, Jeff is the reason that I came to Sun almost a decade ago -- and in particular, I was drawn by Jeff's contagious belief that nothing is impossible simply because it hasn't been done before. So I knew better than to doubt him at the time -- and I knew that the road ahead promised excitement if nothing else. Years after that moment, there is no other conclusion left to be had: ZFS is the most important revolution in storage software in two decades -- and may be the most important idea since the filesystem itself. That may seem a heady claim, but keep reading...

To get an idea of what ZFS can do, first check out Dan Price's awesome ZFS flash demo. Then join me on a tour of today's ZFS blog entries, as ZFS developers and users inside Sun illustrate the power of ZFS: ease of administration, absolute reliability and rippin' performance.

  • Administration. If you're an administrator, start your ZFS blog tour with Martin Englund's entry on ZFS from a sysadmin's view. Martin walks you through the ease of setting up ZFS; there are no hidden wires -- it really is that easy! And if, as a self-defence mechanism, your brain refuses to let you recall the slog of traditional volume management, check out Tim Foster's entry comparing ZFS management to Veritas management. (And have we mentioned the price?) For insight into the design principles that guided the development of the administration tools, check out Eric Schrock's entry on the principles of the ZFS CLI. Eric's entry and his design reflect the principles that we used in DTrace as well: make it simple to do simple things and make it possible to do complicated things. As you can imagine, this simplicity of management is winning fans both inside and outside of Sun. For some testimonials, check out Lin Ling's entry on the love for ZFS -- both Lin's and our Beta customers'. As Lin's entry implies, a common theme among ZFS users is "if I only had this when..."; check out James McPhearson's entry wishing he had ZFS back in the day.

    And if you think that the management of ZFS couldn't get any easier, check out Steve Talley's entry on managing ZFS from your browser. Steve's work highlights the proper role for GUI admin tools in a system: they should make something that's already simple even simpler. They should not be used to smear lipstick over a hideously over-complicated system -- doing so leads to an unresolvable rift between what the tool is telling you the system looks like, and what the system actually looks like. Thanks to the simplicity of ZFS itself, there is no second guessing about what the GUI is actually doing under the hood -- it's all just gravy!

    Speaking of gravy, check out the confluence of ZFS with another revolutionary Solaris technology in Dan Price's entry on ZFS and Zones -- thanks to some great integration work, local zone administrators can have the full power of ZFS without compromising the security of the system!

    For details on particular features of ZFS, check out Mark Maybee's entry on quotas and reservations in ZFS. Unlike some other systems, quotas and reservations are first-class citizens in ZFS, not bolted-on afterthoughts. Die, /usr/sbin/quota, die! And for details on another feature of ZFS, check out Mark Shellenbaum's entry on access control lists in ZFS, and Lisa Week's entry describing why ZFS adopted the NFSv4 ACL model. Like quotas and reservations, ACLs were a part of the design of ZFS -- not something that was carpet-bombed over the source after the fact.

  • Reliability. Unlike virtually every other filesystem that has come before it, ZFS is designed around unreliable hardware. This design-center means that ZFS can detect -- and correct! -- errors that other filesystems just silently propagate to the user. To get a visceral feel for this, read Eric Lowe's entry on ZFS saving the day. Reading this entry will send a chill up your spine: Eric had a data-corrupting hardware problem that he didn't know he had until ZFS. How much data is being corrupted out there today because pre-ZFS filesystems are too trusting of faulty hardware? More to the point, how much of your data is being corrupted today? Yeah -- scary, ain't it? And not only can ZFS detect hardware errors, in a mirrored configuration it can correct them. Fortunately, you don't have to have busted hardware to see this: look at Tim Cook's entry demonstrating ZFS's self-healing by using dd to simulate data corruption.

    But if problems like Eric's are all over the place, how is anyone's data ever correct? The answer is pretty simple, if expensive: you pay for reliability by buying over-priced hardware. That is, we've compensated for dumb software by having smart (and expensive) hardware. ZFS flips the economics on its head: smart software allows for stupid (and cheap) hardware -- with ultra-high reliability. This is a profound shift; for more details on it check out Richard Elling's entry on the reliability of ZFS.

    ZFS is reliable by its architecture, but what of the implementation? As Bill Moore writes, testing ZFS was every bit as important as writing it. And testing ZFS involved many people, as Jim Walker describes in his entry on the scope of the ZFS testing effort.

  • Performance. So fine: ZFS is a snap to administer, and it's ultra-reliable -- but at what performance cost? The short answer is: none, really -- and in fact, on many workloads, it rips. How can you have such features and still have great performance? Generally speaking, ZFS is able to deliver great performance because it has more context, a phenomenon that Bill Sommerfeld notes is a consequence of the end-to-end principle. To see how this unlocks performance, look at Bill Moore's entry on I/O scheduling; as Bill describes (and as I can personally attest to) ZFS is much smarter about how it uses I/O devices than previous filesystems. For another architectural feature for performance, look at Neil Perrin's entry on the ZFS intent log -- and chase it with Neelakanth Nadgir's entry taking you through the ZIL code.

    If you're looking for some performance numbers, check out Roch Bourbonnais's entry comparing the performance of ZFS and UFS. Or let Eric Kustarz take you to school, as you go to Filesystems Performance 101: Disk Bandwidth, Filesystems Performance 102: Filesystem Bandwidth and finally graduate to Filesystems Performance 201: When ZFS Attacks!
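
For the curious, the detect-and-correct behavior described under Reliability above can be modeled in a few lines. This is a toy sketch in Python and emphatically not ZFS code: the two-way mirror held in memory, the choice of SHA-256, and every name here (Mirror, checksum, repairs) are assumptions made purely for illustration. The one idea it borrows is that the checksum is kept apart from the data it covers, so a copy that has been silently corrupted cannot vouch for itself.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum of a block (SHA-256 is an arbitrary choice for this sketch)."""
    return hashlib.sha256(data).hexdigest()


class Mirror:
    """Toy two-way mirror whose checksum is stored apart from the data."""

    def __init__(self, data: bytes):
        self.copies = [bytearray(data), bytearray(data)]
        self.expected = checksum(data)
        self.repairs = 0

    def read(self) -> bytes:
        # Find any copy that still matches the expected checksum.
        good = None
        for copy in self.copies:
            if checksum(bytes(copy)) == self.expected:
                good = bytes(copy)
                break
        if good is None:
            raise IOError("all copies fail their checksum")

        # Self-heal: overwrite every bad copy with the good data.
        for i, copy in enumerate(self.copies):
            if checksum(bytes(copy)) != self.expected:
                self.copies[i] = bytearray(good)
                self.repairs += 1
        return good


mirror = Mirror(b"important data")
mirror.copies[0][0] ^= 0xFF  # simulate silent corruption of one side
print(mirror.read(), mirror.repairs)
```

The read returns the original bytes even though the first copy was damaged, and the damaged side is rewritten along the way. If both sides are corrupted, the sketch raises an error rather than handing back bad data.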

So given that ZFS is all that, when can we start forgetting about every other on-disk filesystem? For that, we'll need to be able to boot off ZFS. Bad news: this is hard. Good news: Tabriz Leman and the rest of the ZFS Boot team are making great progress, as Tabriz describes in her entry on booting ZFS. Once we can boot ZFS -- that is, once we can assume ZFS -- all sorts of cool things become possible, as Bart Smaalders brainstorms in his entry on the impact of ZFS on Solaris. As Bart says, this is just the beginning of the ZFS revolution...

Finally, this has been a long, hard slog for the ZFS team. Anyone who has worked through "crunch time" on a big project will see something of themselves in Noel Dellofano's entry on the final push. And any parent can empathize with Sherry Moore's entry congratulating the team -- and looking forward to having her husband once again available to help with the kids. So congratulations to everyone on the ZFS team (and your families!) -- and for everyone else, welcome to ZFS!

Monday Nov 07, 2005

Your Debian fell into my OpenSolaris!

About three months ago, I wrote about the exciting possibilities of combining Debian and OpenSolaris. To be honest, when I wrote that I assumed that such a Xanadu would be a couple of years off at least -- combining these systems is non-trivial. You can thus imagine my surprise last week when I first heard of the Nexenta project. If you haven't heard, this project is doing exactly what Jeff, Adam, Mike and I were wishing for: Debian package management on an OpenSolaris kernel. Of course, hearing about a new technology is one thing; seeing it is quite another -- I wanted to get my hands on actual bits before I got too excited. Well, today Nexenta released their ISOs. And, to make a pleasantly short story even shorter, I am writing this now running Nexenta on my Acer Ferrari 3400. But don't take my word for it; check out this screenshot -- and note in particular my use of DTrace to examine the package management system. Oh be still, my beating heart!

Tuesday Sep 13, 2005

Man, myth, legend

On a Sunday night shortly after we bought Kealia, I got a call at home from John Fowler. He asked me if I'd like to join Glenn Weinberg and him the next morning to meet with Andy Bechtolsheim at Kealia's offices in Palo Alto. It's hard to express my excitement at this proposition -- it was like asking a kid if he wants to go throw a ball around with Joe DiMaggio. And indeed, my response was something akin to the "Gee mister -- that'd be swell!" that this image evokes...

When we walked in to Kealia's offices the next morning, there, in the foyer, was Andy! Andy blinked for a moment, and then -- without any introductions -- began to excitedly describe some of his machines. Still talking, he marched to his office, whereupon he went to the whiteboard and started furiously drawing block diagrams. Here, at long last, was the Real Deal: a fabled engineer who didn't disappoint -- a giant who dwarfed the substantial legend that preceded him. After several minutes at the whiteboard, Andy got so excited that he had to actually get the plans to show us how some particular piece had been engineered. And with that, he flew out of the room.

As we caught our breath, Glenn looked at me and said "just so you know, this is what it's like trying to talk to you." While I was still trying to figure out if this was a compliment or an insult (which I still haven't figured out, by the way), Andy flew back in, unfurled some plans for a machine and excitedly pointed out some of the finer details of his design. Andy went on for a few more minutes when, like a raging prairie fire that had suddenly hit a fireline, he went quiet. With that, I kicked the door down (metaphorically) and started describing what we had been working on in Solaris 10. (After all, John hadn't brought me along to just sit back and watch.) As I went through revolutionary feature after revolutionary feature, I was astounded by how quickly Andy grasped detail -- he asked incisive questions that reflected a greater understanding of software than any other hardware engineer I had ever encountered. And as he seemed to be absorbing detail faster and faster, I began delivering it faster and faster. Now, as others have observed, I'm not exactly a slow talker; this might have been one of the few times in my life where I thought I actually needed to speak faster to stay in front of my audience. Whew! Most impressive of all, Andy had a keen intuition for the system -- he immediately saw how his innovative hardware and our innovative software could combine to deliver some uniquely innovative systems to our customers. He was excited about our software; we were excited about his hardware. How much better than that does it get?

Needless to say, ever since that morning -- which was nearly a year and a half ago now -- I have eagerly awaited the day that Andy's boxes would ship. If you've talked to me over the last year, you know that I've been very bullish on Sun; now you know why. (Well, now you have a taste as to why; believe it or not, Andy's best boxes are still to come.) Not everyone can own a car designed by Enzo Ferrari or a lamp crafted by Louis Comfort Tiffany -- but at just over two grand a pop, pretty much everyone can own a machine designed by the greatest single-board computer designer in history. Congratulations Andy and team on an historic launch! And might I add that it was especially fitting that it was welcomed with what is easily the funniest ad in Sun's history.

Friday Sep 09, 2005


So MIT's Technology Review has named me as one of their TR35 -- the top 35 innovators under the age of thirty-five. It's a great honor, especially because the other honorees are actually working on things like cures for cancer and rocket science -- domains that I have known only as rhetorical flourish. Should you like to hear me make a jackass out of myself on the subject, you might want to check out Richard Giles's latest I/O podcast, in which he interviewed me about the award.

Sunday Aug 21, 2005

DTrace and Ruby

It's been an exciting few weeks for DTrace. The party got started with Wez Furlong's new PHP DTrace provider at OSCON. Then Devon O'Dell announced that he was starting to work in earnest on a DTrace port to FreeBSD. And now, Rich Lowe has made available a prototype Ruby DTrace provider. To install this, grab Ruby 1.8.2, apply Rich's patch, and run ./configure with the --enable-dtrace option. When you run the resulting ruby, you'll see two probes: function-entry and function-return. The arguments to these probes are as follows:
  • arg0 is the name of the class (a pointer to a string within Ruby)

  • arg1 is the name of the method (also a pointer to a string within Ruby)

  • arg2 is the name of the file containing the call site (again, a pointer to a string within Ruby)

  • arg3 is the line number of the call site.

So if, for example, you'd like to know the classes and methods that are called in a particular Ruby script, you could do it with this simple D script:

#pragma D option quiet

ruby$target:::function-entry    /* provider name "ruby" is assumed */
{
        @[copyinstr(arg0), copyinstr(arg1)] = count();
}

END
{
        printf("%15s %30s   %s\n", "CLASS", "METHOD", "COUNT");
        printa("%15s %30s   %@d\n", @);
}

To run this against the cal.rb that ships in the sample directory of Ruby, call the above script whatmethods.d and run it this way:

# dtrace -s ./whatmethods.d -c "../ruby ./cal.rb"
    August 2005
 S  M Tu  W Th  F  S
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31

          CLASS                         METHOD   COUNT
          Array                             <<   1
          Array                        compact   1
          Array                        reverse   1
          Array                          shift   1
          Array                           size   1
          Array                        unshift   1
          Array                      values_at   1
            Cal                          group   1
            Cal                     initialize   1
            Cal                        monthly   1
            Cal                          opt_c   1
            Cal                          opt_j   1
            Cal                          opt_m   1
            Cal                          opt_t   1
            Cal                          opt_y   1
            Cal                           pict   1
            Cal                     set_params   1
            Cal                        unlines   1
          Class                            now   1
          Class                   valid_civil?   1
          Class                    valid_date?   1
           Date                              -   1
             IO                          write   1
         Module                append_features   1
         Module                        include   1
         Module                       included   1
         Module                   undef_method   1
         Object                         detect   1
         Object                           nil?   1
         Object                          print   1
         Object     singleton_method_undefined   1
         String                             ==   1
         String                         center   1
         String                         empty?   1
         String                           scan   1
         String                          split   1
           Time                     initialize   1
           Time                           to_a   1
          Array                             ==   2
          Class                      jd_to_ajd   2
         Fixnum                             >=   2
           Hash                           each   2
           Hash                           keys   2
         Module                           attr   2
         Module               method_undefined   2
         Module                         public   2
       Rational                         coerce   2
          Array                              +   3
          Class                    civil_to_jd   3
           Hash                             []   3
         Object                        collect   3
          Array                        collect   4
          Class                      inherited   4
          Range                           each   4
         String                           size   4
         Module           private_class_method   5
         Object                           eval   5
         Object                        require   5
         String                           gsub   5
          Class                     jd_to_wday   7
          Class                           once   7
           Date                       __8713__   7
           Date                           wday   7
         Fixnum                              %   8
          Array                           join   10
           Hash                            []=   10
         String                              +   10
          Array                           each   11
       NilClass                           to_s   11
         Module                   alias_method   22
         Module                        private   22
         Symbol                           to_s   26
         Module                    module_eval   28
           Date                           mday   31
         Object                           send   31
           Date                            mon   42
           Date                      __11105__   43
          Class                    jd_to_civil   45
           Date                           succ   47
          Class                            os?   48
           Date                              +   49
         Fixnum                             <=   49
         String                          rjust   49
          Class                      ajd_to_jd   50
          Class                        clfloor   50
           Date                      __10417__   50
        Integer                           to_i   50
         Object                        Integer   50
       Rational                         divmod   50
       Rational                           to_i   50
           Date                            <=>   51
           Date                            ajd   51
           Date                             jd   51
       Rational                            <=>   51
          Class                           new0   52
           Date                     initialize   52
        Integer                           to_r   52
         Object         singleton_method_added   67
           Date                          civil   75
         Symbol                           to_i   91
          Float                              *   96
          Float                         coerce   96
         Fixnum                              /   97
         Object                        frozen?   100
       Rational                              -   104
         Fixnum                           to_s   123
          Array                             []   141
         Object                          class   150
         Module                   method_added   154
          Float                              /   186
         Module                            ===   200
       Rational                              /   204
       Rational                              +   248
          Float                          floor   282
         Fixnum                             <<   306
          Class                         reduce   356
        Integer                            gcd   356
         Object                       Rational   356
         Fixnum                              +   414
          Class                           new!   610
       Rational                     initialize   610
          Class                            new   612
         Fixnum                            abs   712
         Fixnum                            div   762
         Fixnum                              *   1046
         Fixnum                              <   1069
         Fixnum                              >   1970
         Fixnum                              -   2398
         Object                       kind_of?   2439
         Fixnum                             >>   4698
         Fixnum                             []   7689
         Fixnum                             ==   11436

This may leave us with many questions. For example, there are a couple of calls to construct new objects -- where are they coming from? To answer that question:

#pragma D option quiet

ruby$target:::function-entry    /* provider name "ruby" is assumed */
/copyinstr(arg1) == "initialize"/
{
        @[copyinstr(arg0), copyinstr(arg2), arg3] = count();
}

END
{
        printf("%-10s %-40s %-10s %s\n", "CLASS",
            "INITIALIZED IN FILE", "AT LINE", "COUNT");
        printa("%-10s %-40s %-10d %@d\n", @);
}

Calling the above whereinit.d, we can run it in a similar manner:

# dtrace -s ./whereinit.d -c "../ruby ./cal.rb"
    August 2005
 S  M Tu  W Th  F  S
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31

CLASS      INITIALIZED IN FILE                      AT LINE    COUNT
Cal        ./cal.rb                                 144        1
Date       /usr/local/lib/ruby/1.8/date.rb          593        1
Date       /usr/local/lib/ruby/1.8/date.rb          703        1
Date       /usr/local/lib/ruby/1.8/date.rb          916        1
Time       /usr/local/lib/ruby/1.8/date.rb          702        1
Date       /usr/local/lib/ruby/1.8/date.rb          901        49
Rational   /usr/local/lib/ruby/1.8/rational.rb      374        610

Looking at the Date class, it's interesting to look at line 901 of file /usr/local/lib/ruby/1.8/date.rb:

   897   # If +n+ is not a Numeric, a TypeError will be thrown.  In
   898   # particular, two Dates cannot be added to each other.
   899   def + (n)
   900     case n
   901     when Numeric; return self.class.new0(@ajd + n, @of, @sg)
   902     end
   903     raise TypeError, 'expected numeric'
   904   end

This makes sense: we're initializing new Date instances in the + method for Date. And where are those coming from? It's not hard to build a script that will tell us the file and line for the call site of an arbitrary class and method:

#pragma D option quiet

ruby$target:::function-entry    /* provider name "ruby" is assumed */
/copyinstr(arg0) == $$1 && copyinstr(arg1) == $$2/
{
        @[copyinstr(arg2), arg3] = count();
}

END
{
        printf("%-40s %-10s %s\n", "FILE", "LINE", "COUNT");
        printa("%-40s %-10d %@d\n", @);
}

For this particular example (Date#+()), call the above wherecall.d and run it this way:

# dtrace -s ./wherecall.d "Date" "+" -c "../ruby ./cal.rb"
    August 2005
 S  M Tu  W Th  F  S
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31

FILE                                     LINE       COUNT
./cal.rb                                 102        2
./cal.rb                                 60         6
./cal.rb                                 63         41

And looking at the indicated lines in cal.rb:

    55   def pict(y, m)
    56     d = (1..31).detect{|d| Date.valid_date?(y, m, d, @start)}
    57     fi = Date.new(y, m, d, @start)
    58     fi -= (fi.jd - @k + 1) % 7
    60     ve  = (fi..fi +  6).collect{|cu|
    61       %w(S M Tu W Th F S)[cu.wday]
    62     }
    63     ve += (fi..fi + 41).collect{|cu|
    64       if cu.mon == m then cu.send(@da) end.to_s
    65     }

So this is doing exactly what we would expect, given the code. Now, if we were interested in making this perform a little better, we might be interested to know the work that is being induced by Date#+(). Here's a script that reports the classes and methods called by a given class/method -- and its callees:

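#pragma D option quiet

ruby$target:::function-entry    /* provider name "ruby" is assumed */
/copyinstr(arg0) == $$1 && copyinstr(arg1) == $$2/
{
        self->date = 1;
        ndates++;               /* calls to the target, for normalize() */
}

ruby$target:::function-entry
/self->date/
{
        @[strjoin(strjoin(copyinstr(arg0), "#"),
            copyinstr(arg1))] = count();
}

ruby$target:::function-return
/copyinstr(arg0) == $$1 && copyinstr(arg1) == $$2/
{
        self->date = 0;
}

END
{
        normalize(@, ndates);
        printf("Each call to %s#%s() induced:\n\n", $$1, $$2);
        printa("%@8d call(s) to %s()\n", @);
}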

Calling the above whatcalls.d, we can answer the question about Date#+() this way:

# dtrace -s ./whatcalls.d "Date" "+" -c "../ruby ./cal.rb"
    August 2005
 S  M Tu  W Th  F  S
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31

Each call to Date#+() induced:

       1 call(s) to Class#new0()
       1 call(s) to Class#reduce()
       1 call(s) to Date#+()
       1 call(s) to Date#initialize()
       1 call(s) to Fixnum#+()
       1 call(s) to Fixnum#<<()
       1 call(s) to Integer#gcd()
       1 call(s) to Module#===()
       1 call(s) to Object#Rational()
       1 call(s) to Object#class()
       2 call(s) to Class#new()
       2 call(s) to Class#new!()
       2 call(s) to Fixnum#abs()
       2 call(s) to Fixnum#div()
       2 call(s) to Rational#+()
       2 call(s) to Rational#initialize()
       3 call(s) to Fixnum#*()
       3 call(s) to Fixnum#<()
       8 call(s) to Object#kind_of?()
      10 call(s) to Fixnum#-()
      10 call(s) to Fixnum#>()
      23 call(s) to Fixnum#>>()
      37 call(s) to Fixnum#[]()
      52 call(s) to Fixnum#==()

That's a lot of work for something that should be pretty simple! Indeed, it's counterintuitive that, say, Integer#gcd() would be called from Date#+() -- and it certainly seems suboptimal. I'll leave further exploration into this as an exercise to the reader, but suffice it to say that this has to do with the use of a rational number in the Date class -- the elimination of which would eliminate most of the above calls and presumably greatly improve the performance of Date#+().
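
To see why rational arithmetic would pull in a gcd at all, here is a minimal sketch of fraction addition in plain Ruby. The add_fractions helper is hypothetical: it is an illustration of the general technique, not the code in rational.rb, and it assumes only that a rational result is kept in lowest terms.

```ruby
# Add a_num/a_den + b_num/b_den and reduce the result to lowest terms.
# add_fractions is a hypothetical stand-in for a pure-Ruby rational add;
# it is not the implementation that ships in rational.rb.
def add_fractions(a_num, a_den, b_num, b_den)
  num = a_num * b_den + b_num * a_den # cross-multiply the numerators
  den = a_den * b_den                 # common (unreduced) denominator
  g = num.gcd(den)                    # every reduction needs a gcd
  [num / g, den / g]
end

p add_fractions(1, 2, 1, 6)
```

Even this toy version performs several multiplications, a gcd and two divisions for one addition, which is at least consistent with the shape of the call counts above.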

Now, Ruby aficionados may note that some of the above functionality has been available in Ruby by way of set_trace_func (upon which the Ruby profiler is implemented). While that's true (to a point -- set_trace_func seems to be a pretty limited mechanism), the Ruby DTrace provider is nonetheless a great lurch forward for Ruby developers: it allows developers to use DTrace-specific constructs like aggregations and thread-local variables to home in on a problem; it allows Ruby-related work performed lower in the stack (e.g., in the I/O subsystem, CPU dispatcher or network stack) to be connected to the Ruby code inducing it; it allows a running Ruby script to be instrumented (and reinstrumented) without stopping or restarting it; and it allows multiple, disjoint Ruby scripts to be coherently observed and understood as a single entity. More succinctly, it's just damned cool. So thanks to Rich for developing the prototype provider -- and if you're a Ruby developer, enjoy!
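
For comparison, here is a rough sketch of the per-method call counting done in-process with set_trace_func. The count_calls helper and the Greeter class are hypothetical names made up for this illustration. Unlike the D scripts, this only works from inside the script being observed, and it sees nothing below the interpreter.

```ruby
# Count method calls by [class, method] while the given block runs.
# count_calls is a hypothetical helper, not part of Ruby or of DTrace.
def count_calls
  counts = Hash.new(0)
  tracer = proc do |event, _file, _line, id, _binding, klass|
    # "call" fires for methods written in Ruby, "c-call" for C-implemented ones.
    counts[[klass.to_s, id.to_s]] += 1 if event == "call" || event == "c-call"
  end
  set_trace_func(tracer)
  yield
  counts
ensure
  set_trace_func(nil) # always remove the hook, even if the block raises
end

# Hypothetical class to trace.
class Greeter
  def hello(name)
    "hello, #{name}"
  end
end

counts = count_calls do
  g = Greeter.new
  3.times { g.hello("world") }
end

# Print in ascending order of count, in the same layout as the D output.
counts.sort_by { |_key, n| n }.each do |(klass, meth), n|
  printf("%15s %30s   %d\n", klass, meth, n)
end
```

The hash keyed on a [class, method] pair plays the role of the D aggregation. The hook has to be installed and removed by the program itself, which is the limitation the DTrace provider lifts.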

Tuesday Aug 16, 2005

DTrace on FreeBSD?

One of the exciting things about OpenSolaris is that it's released under a license -- the CDDL -- that allows ports of individual components to other systems. In particular, at my OSCON presentation two weeks ago, I discussed some of the expertise required to port one such component, DTrace, to another system. I'm happy to now report that Devon O'Dell has started working on a port to one such system, FreeBSD. This has been talked about before (in some cases, with braggadocio), but Devon is the first to start the work in earnest. And indeed, work it will be: DTrace isn't a simple system, and it has several dependencies on other, Solaris-specific system components. That said, it should certainly be possible, and we on Team DTrace are available to help out in any way we can. So if you're interested in working on this, you should ping Devon -- I know that he'll welcome the help. And if you have specific questions about DTrace internals (or anything, for that matter), swing by #opensolaris and join the party!

Thursday Aug 11, 2005

Ubuntu and DTrace break bread

Today, Mike, Adam and I had lunch with Jeff Waugh, who is in town for LinuxWorld. He showed us his, we showed him ours, and a great time was had by all. I think everyone agreed that a system with Debian packages, Ubuntu package management, the Solaris Service Management Facility and (of course) DTrace would be one hell of a system. And with OpenSolaris, this paradise seems tantalizingly attainable -- or does original sin prevent us from reentering the garden?


