Saturday Oct 04, 2008

Apple updates DTrace... again

Back in January, I ranted about Apple's ham-handed breakage in their DTrace port. After some injured feelings and teary embraces, Apple cleaned things up a bit, but some nagging issues remained as I wrote:

For the Apple folks: I'd argue that revealing the name of otherwise untraceable processes is no more transparent than what Activity Monitor provides — could I have that please?

It would be very un-Apple to — you know — communicate future development plans, but in 10.5.5, DTrace has seen another improvement. Previously when using DTrace to observe the system at large, iTunes and other paranoid apps would be hidden; now they're showing up on the radar:

# dtrace -n 'profile-1999{ @[execname] = count(); }'
dtrace: description 'profile-1999' matched 1 probe

  loginwindow                                                       2
  fseventsd                                                         3
  kdcmond                                                           5
  socketfilterfw                                                    5
  distnoted                                                         7
  mds                                                               8
  dtrace                                                           12
  punchin-helper                                                   12
  Dock                                                             20
  Mail                                                             25
  Terminal                                                         26
  SystemUIServer                                                   28
  Finder                                                           42
  Activity Monito                                                  49
  pmTool                                                           67
  WindowServer                                                    184
  iTunes                                                         1482
  kernel_task                                                    4030

And of course, you can use generally available probes to observe only those touchy apps with a predicate:

# dtrace -n 'syscall:::entry/execname == "iTunes"/{ @[probefunc] = count(); }'
dtrace: description 'syscall:::entry' matched 427 probes

  pwrite                                                           13
  read                                                             13
  stat64                                                           13
  open_nocancel                                                    14
  getuid                                                           22
  getdirentries                                                    26
  pread                                                            29
  stat                                                             32
  gettimeofday                                                     34
  open                                                             36
  close                                                            37
  geteuid                                                          65
  getattrlist                                                     199
  munmap                                                          328
  mmap                                                            338

Predictably, the details of iTunes are still obscured:

# dtrace -n pid42896:::entry
dtrace: error on enabled probe ID 225607 (ID 69364: pid42896:libSystem.B.dylib:pthread_mutex_unlock:entry):
  invalid user access in action #1
dtrace: error on enabled probe ID 225546 (ID 69425: pid42896:libSystem.B.dylib:spin_lock:entry):
  invalid user access in action #1
dtrace: 1005103 drops on CPU 1

... which is fine by me; I've got code of my own I should be investigating. While I'm loath to point it out, an astute reader and savvy DTrace user will note that Apple may have left the door open an inch wider than they had anticipated. Anyone care to post some D code that makes use of that inch? I'll post an update as a comment in a week or two if no one sees it.

Update: There were some good ideas in the comments. Here's the start of a script that can let you follow the flow of control of a thread in an "untraceable" process:

#!/usr/sbin/dtrace -s

#pragma D option quiet

	trace("this program is already traceable\\n");

/self->level < 0 || self->level > 40/
	self->level = 0;

	this->p = ((dtrace_state_t \*)arg0)->dts_ecbs[arg1 - 1]->dte_probe;
	this->mod = this->p->dtpr_mod;
	this->func = this->p->dtpr_func;
	this->entry = ("entry" == stringof(this->p->dtpr_name));

	printf("%\*s-> %s:%s\\n", self->level \* 2, "",
	    stringof(this->mod), stringof(this->func));

	printf("%\*s<- %s:%s\\n", self->level \* 2, "",
	    stringof(this->mod), stringof(this->func));

Saturday Jun 07, 2008

Apple updates DTrace

Back in January, I posted about a problem with Apple's port of DTrace to Mac OS X. The heart of the issue is that their port would silently drop data such that certain experiments would be quietly invalid. Unfortunately, most reactions seized on a headline paraphrasing a line of the post — albeit with the critical negation omitted (the subject and language were, perhaps, too baroque to expect the press to read every excruciating word). The good news is that Apple has (quietly) fixed the problem in Mac OS X 10.5.3.

One issue was that timer based probes wouldn't fire if certain applications were actively executing (e.g. iTunes). This was evident both by counting periodic probe firings, and by the absence of certain applications when profiling. Apple chose to solve this problem by allowing the probes to fire while denying any inspection of untraceable processes (and generating a verbose error in that case). This script which should count 1000 firings per virtual CPU gave sporadic results on earlier revisions of Mac OS X 10.5:

	@ = count();


On 10.5.3, the output is exactly what one would expect on a 2-core CPU (1,000 executions per core):

  1  22697                         :tick-1s 

  1  22697                         :tick-1s 

On previous revisions, profiling to see what applications were spending the most time on CPU would silently omit certain applications. Now, while we can't actually peer into those apps, we can infer the presence of stealthy apps when we encounter an error:

	@[execname] = count();

	@["=stealth app="] = count();

Running this DTrace script will generate a lot of errors as we try to evaluate the execname variable for secret applications, but at the end we'll end up with a table like this:

  Adium                                                             1
  GrowlHelperApp                                                    1
  iCal                                                              1
  kdcmond                                                           1
  loginwindow                                                       1
  Mail                                                              2
  Activity Monito                                                   3
  ntpd                                                              3
  pmTool                                                            6
  mlb-nexdef-auto                                                  12
  Terminal                                                         14
  =stealth app=                                                    29
  WindowServer                                                     34
  kernel_task                                                     307
  Safari                                                          571

A big thank you to Apple for making progress on this issue; the situation is now much improved and considerably more palatable. That said, there are a couple of problems. The first is squarely the fault of team DTrace: we should probably have a mode where errors aren't printed particularly if the script is already handling them explicitly using an ERROR probe as in the script above. For the Apple folks: I'd argue that revealing the name of otherwise untraceable processes is no more transparent than what Activity Monitor provides — could I have that please? Also, I'm not sure if this has always been true, but the ustack() action doesn't seem to work from the profile action so simple profiling scripts like this one produce a bunch of errors and no output:

/execname == "Safari"/
	@[ustack()] = count();

But to reiterate: thank you thank you thank you, Steve, James, Tom, and the rest of the DTrace folks at Apple. It's great to see these issues being addressed. The whole DTrace community appreciates it.

Sunday May 04, 2008

dtrace.conf post-post-mortem

This originally was going to be a post-mortem on dtrace.conf, but so much time has passed, that I doubt it qualifies anymore. Back in March, we held the first ever DTrace (un)conference, and I hope I speak for all involved when I declare it a terrific success. And our t-shirts (logo pictured) were, frankly, bomb. Here are some fairly random impressions from the day:

Notes on the demographics at dtrace.conf: Macs were the most prevalent laptops by quite a wide margin, and a ton of demos were done under VMware for the Mac. There were a handful of dvorak users who far outnumbered the Esperanto speakers (there were none) despite apparently similarly rationales. There were, by a wide margin, more live demonstrations that I'd seen during a day of technical talks; there were probably fewer individual slides than demos -- exactly what we had in mind.

My favorite session brought the authors of the three DTrace ports to the front of the room to talk about porting, and answer questions (mostly from the DTrace team). I was excited that they agreed to work together on a wiki and on a DTrace porting project. Both would be great for new ports and for building a repository that could integrate all the ports into a single repository. I just have to see if I can get them to follow through now several weeks removed from the DTrace love-in...

Also particularly interesting were a demonstration of a DTrace-enabled Adobe Air prototype and the very clever mechanism behind the Java group's plan for native Java static probes (JSDT). Essentially, they're using the same technique as normal USDT, but dynamically generating the tracing description structures and sending them down to the kernel (slick).

The most interesting discussion resulted from Keith's presentation of vprobes -- a DTrace... um... inspired facility in VMware. While it is necessary to place a unified tracing mechanism at the lowest level of software abstraction (in DTrace's case, the kernel), it may also make sense to embed collaborating tracing frameworks at other levels of the stack. For example, the JVM could include a micro-DTrace which communicated with DTrace in the kernel as needed. This would both improve enabled performance (not a primary focus of DTrace), and allow for better domain-specific instrumentation and expression. I'll be interested to see how vprobes executes on this idea.

Requests from the DTrace community:

  • more providers ala the recent nfs and proposed ip providers
  • consistency between providers (kudos to those sending their providers to the DTrace discussion list for review)
  • better compatibility with the ports -- several people observed that while they love the port to Leopard, Apple's spurious exclusion of the -G option created tricky conflicts

Ben was kind enough to video the entire day. We should have the footage publicly available in about a week. Thanks to all who participated; several recent projects have already gotten me excited for dtrace.conf(09).

Thursday Apr 10, 2008

DTrace and JavaOne: The End of the Beginning

It was a good run, but Jarod and I didn't make the cut for JavaOne this year...


In 2005, Jarod came up with what he described as a jacked up way to use DTrace to get inside Java. This became the basis of the Java provider (first dvm for the 1.4.2 and 1.5 JVMs and now the hotspot provider for Java 6). That year, I got to stand up on stage at the keynote with John Loiacono and present DTrace for Java for the first time (to 10,000 people -- I was nervous). John was then the EVP of software at Sun. Shortly after that, he parlayed our keynote success into a sweet gig at Adobe (I was considered for the job, but ultimately rejected, they said, because their door frames couldn't accommodate my fro -- legal action is pending).

That year we also started the DTrace challenge. The premise was that if we chained up Jarod in the exhibition hall, developers could bring him their applications and he could use DTrace to find a performance win -- or he'd fork over a free iPod. In three years Jarod has given out one iPod and that one deserves a Bondsian asterisk.

After the excitement of the keynote, and the frenetic pace of the exhibition hall (and a haircut), Jarod and I anticipated at least fair interest in our talk, but we expected the numbers to be down a bit because we were presenting in the afternoon on the last day of the conference. We got to the room 15 minutes early to set up, skirting what we thought must have been the line for lunch, or free beer, or something, but turned out to be the line for our talk. Damn. It turns out that in addition to the 1,000 in the room, there was an overflow room with another 500-1,000 people. That first DTrace for Java talk had only the most basic features like tracing method entry and return, memory allocation, and Java stack backtraces -- but we already knew we were off to a good start.


No keynote, but the DTrace challenge was on again and our talk reprised its primo slot on the last day of the conference after lunch (yes, that's sarcasm). That year the Java group took the step of including DTrace support in the JVM itself. It was also possible to dynamically turn instrumentation of the JVM off and on as opposed to the start-time option of the year before. In addition to our talk, there was a DTrace hands-on lab that was quite popular and got people some DTrace experience after watching what it can do in the hands of someone like Jarod.


The DTrace talk in 2007 (again, last day of the conference after lunch) was actually one of my favorite demos I've given because I had never seen the technology we were presenting before. Shortly before JavaOne started, Lev Serebryakov from the Java group had built a way of embedding static probes in a Java program. While this isn't required to trace Java code, it does mean that developers can expose the higher level semantics of their programs to users and developers through DTrace. Jarod hacked up an example in his hotel room about 20 minutes before we presented, and amazingly it all went off without a hitch. How money is that?

JSDT -- as the Java Statically Defined Tracing is called -- is in development for the next version of the JVM, and is the next step for DTrace support of dynamic languages. Java was the first dynamic language that we first considered for use with DTrace, and it's quite a tough environment to support due to the incredible sophistication of the JVM. That support has lead the way for other dynamic languages such as Ruby, Perl, and Python which all now have built-in DTrace providers.


For DTrace and Java, this is not the end. It is not even the beginning of the end. Jarod and I are out, but Jon, Simon, Angelo, Raghavan, Amit, and others are in. At JavaOne 2008 next month there will be a talk, a BOF, and a hands-on lab about DTrace for Java and it's not even all Java: there's some php and JavaScript mixed in and both also have their own DTrace providers. I've enjoyed speaking at JavaOne these past three years, and while it's good to pass the torch, I'll miss doing it again this year. If I have the time, and can get past security I'll try to sneak into Jon and Simon's talk -- though it will be a departure from tradition for a DTrace talk to fall on a day other than the last.

Wednesday Mar 12, 2008

pid2proc for DTrace

The other day, there was an interesting post on the DTrace mailing list asking how to derive a process name from a pid. This really ought to be a built-in feature of D, but it isn't (at least not yet). I hacked up a solution to the user's problem by cribbing the algorithm from mdb's ::pid2proc function whose source code you can find here. The basic idea is that you need to look up the pid in pidhash to get a chain of struct pid that you need to walk until you find the pid in question. This in turn gives you an index into procdir which is an array of pointers to proc structures. To find out more about these structures, poke around the source code or mdb -k which is what I did.

The code isn't exactly gorgeous, but it gets the job done. It's a good example of probe-local variables (also somewhat misleadingly called clause-local variables), and demonstrates how you can use them to communicate values between clauses associated with a given probe during a given firing. You can try it out by running dtrace -c <your-command> -s <this-script>.

       this->pidp = `pidhash[$target & (`pid_hashsz - 1)];
       this->pidname = "-error-";

/\* Repeat this clause to accommodate longer hash chains. \*/
/this->pidp->pid_id != $target && this->pidp->pid_link != 0/
       this->pidp = this->pidp->pid_link;

/this->pidp->pid_id != $target && this->pidp->pid_link == 0/
       this->pidname = "-no such process-";

/this->pidp->pid_id != $target && this->pidp->pid_link != 0/
       this->pidname = "-hash chain too long-";

/this->pidp->pid_id == $target/
       /\* Workaround for bug 6465277 \*/
       this->slot = (\*(uint32_t \*)this->pidp) >> 8;

       /\* AHA! We finally have the proc_t. \*/
       this->procp = `procdir[this->slot].pe_proc;

       /\* For this example, we'll grab the process name to print. \*/
       this->pidname = this->procp->p_user.u_comm;

       printf("%d %s", $target, this->pidname);

Note that the second clause is the bit that walks the hash chain. You can repeat this clause as many times as you think will be needed to traverse the hash chain -- I really don't have any guidance here, but I imagine that a few times should suffice. Alternatively, you could construct a tick probe that steps along the hash chain to avoid a fixed limit. DTrace attempts to keep easy things easy and make difficult things possible. As evidenced by this example, possible doesn't necessarily correlate with beautiful.

Friday Jan 18, 2008

Mac OS X and the missing probes

As has been thoroughly recorded, Apple has included DTrace in Mac OS X. I've been using it as often as I have the opportunity, and it's a joy to be able to use the fruits of our labor on another operating system. But I hit a rather surprising case recently which led me to discover a serious problem with Apple's implementation.

A common trick with DTrace is to use a tick probe to report data periodically. For example, the following script reports the ten most frequently accessed files every 10 seconds:

	@[args[2]->fi_pathname] = count();

	trunc(@, 10);
	trunc(@, 0);

This was running fine, but it seemed as though sometimes (particularly with certain apps in the background) it would occasionally skip one of the ten second iterations. Odd. So I wrote the following script to see what was going on:

	@ = count();


What this will do is fire a probe at 1000hz on all (logical) CPUs. Running this on a dual-core machine we'd expect to see it print out 2000 each time. Instead I saw this:

  0  22369                         :tick-1s 

  0  22369                         :tick-1s 

  0  22369                         :tick-1s 

  0  22369                         :tick-1s 

  0  22369                         :tick-1s 

  0  22369                         :tick-1s 

  0  22369                         :tick-1s 

  0  22369                         :tick-1s 

Kind of bizarre. The missing tick-1s probes explain the values over 2000, but weirder were the values so far under 2000. To explore a bit more I performed another DTrace experiment to see what applications were running when the profile probe fired:

# dtrace -n profile-997'{ @[execname] = count(); }'
dtrace: description 'profile-997' matched 1 probe

  Finder                                                            1
  configd                                                           1
  DirectoryServic                                                   2
  GrowlHelperApp                                                    2
  llipd                                                             2
  launchd                                                           3
  mDNSResponder                                                     3
  fseventsd                                                         4
  mds                                                               4
  lsd                                                               5
  ntpd                                                              6
  kdcmond                                                           7
  SystemUIServer                                                    8
  dtrace                                                            8
  loginwindow                                                       9
  pvsnatd                                                          21
  Dock                                                             41
  Activity Monito                                                  45
  pmTool                                                           52
  Google Notifier                                                  60
  Terminal                                                        153
  WindowServer                                                    238
  Safari                                                         1361
  kernel_task                                                    4247

While there's nothing suspicious about the output in itself, it was strange because I was listening to music at the time. With iTunes. Where was iTunes?
I ran the first experiment again and caused iTunes to do more work which yielded these results:

  0  22369                         :tick-1s 

  0  22369                         :tick-1s 

  0  22369                         :tick-1s 

  0  22369                         :tick-1s 

So what was iTunes doing? To answer that I again turned to DTrace and used the following enabling to see what functions were being called most frequently by iTunes (whose process ID was 332):

# dtrace -n 'pid332:::entry{ @[probefunc] = count(); }'
dtrace: description 'pid332:::entry' matched 264630 probes

I let it run for a while, made iTunes do some work, and the result when I stopped the script? Nothing. The expensive DTrace invocation clearly caused iTunes to do a lot more work, but DTrace was giving me no output.
Which started me thinking... did they? Surely not. They wouldn't disable DTrace for certain applications.

But that's exactly what Apple's done with their DTrace implementation. The notion of true systemic tracing was a bit too egalitarian for their classist sensibilities so they added this glob of lard into dtrace_probe() -- the heart of DTrace:

#if defined(__APPLE__)
         \* If the thread on which this probe has fired belongs to a process marked P_LNOATTACH
         \* then this enabling is not permitted to observe it. Move along, nothing to see here.
        if (ISSET(current_proc()->p_lflag, P_LNOATTACH)) {
#endif /\* __APPLE__ \*/

Wow. So Apple is explicitly preventing DTrace from examining or recording data for processes which don't permit tracing. This is antithetical to the notion of systemic tracing, antithetical to the goals of DTrace, and antithetical to the spirit of open source. I'm sure this was inserted under pressure from ISVs, but that makes the pill no easier to swallow. To say that Apple has crippled DTrace on Mac OS X would be a bit alarmist, but they've certainly undermined its efficacy and, in doing do, unintentionally damaged some of its most basic functionality. To users of Mac OS X and of DTrace: Apple has done a service by porting DTrace, but let's convince them to go one step further and port it properly.

Saturday Oct 27, 2007


It's been more than a year since I first saw DTrace on Mac OS X, and now it's at last generally available to the public. Not only did Apple port DTrace, but they've also included a bunch of USDT providers. Perl, Python, Ruby -- they all ship in Leopard with built-in DTrace probes that allow developers to observe function calls, object allocation, and other points of interest from the perspective of that dynamic language. Apple did make some odd choices (e.g. no Java provider, spurious modifications to the publicly available providers, a different build process), but on the whole it's very impressive.

Perhaps it was too much to hope for, but with Apple's obvious affection for DTrace I thought they might include USDT probes for Safari. Specifically, probes in the JavaScript interpreter would empower developers in the same way they enabled Ruby, Perl, and Python developers. Fortunately, the folks at the Mozilla Foundation have already done the heavy lifting for Firefox -- it was just a matter of compiling Firefox on Mac OS X 10.5 with DTrace enabled:

There were some minor modifications I had to make to the Firefox build process to get everything working, but it wasn't too tricky. I'll try to get a patch submitted this week, and then Firefox will have the same probes on Mac OS X that it does -- thanks to Brendan's early efforts -- on Solaris. JavaScript developers take note: this is good news.

Sunday Aug 05, 2007

What-If Machine: DTrace Port

What if there were a port of DTrace to Linux?

What if there were a port of DTrace to Linux: could such a thing be done without violating either the GPL or CDDL? Read on before you jump right to the comments section to add your two cents.

In my last post, I discussed an attempt to create a DTrace knockoff in Linux, and suggested that a port might be possible. Naively, I hoped that comments would examine the heart of my argument, bemoan the apparent NIH in the Linux knockoff, regret the misappropriation of slideware, and maybe discuss some technical details -- anything but dwell on licensing issues.

For this post, I welcome the debate. Open source licenses are important, and the choice can have a profound impact on the success of the software and the community. But conversations comparing the excruciating minutia of one license and another are exhausting, and usually become pointless in a hurry. Having a concrete subject might lead to a productive conversation.

DTrace Port Details

Just for the sake of discussion, let's say that Google decide to port DTrace to Linux (everyone loves Google, right?). This isn't so far fetched: Google uses Linux internally, maybe they're using SystemTap, maybe they're not happy with it, but they definitely (probably) care about dynamic tracing (just like all good system administrators and developers should). So suppose some engineers at Google take the following (purely hypothetical) steps:

Kernel Hooks

DTrace has a little bit of functionality that lives in the core kernel. The code to deal with invalid memory accesses, some glue between the kernel's dynamic linker and some of the DTrace instrumentation providers, and some simple, low-level routines cover the bulk of it. My guess is there are about 1500 lines of code all told: not trivial, but hardly insurmountable. Google implements these facilities in a manner designed to allow the results to be licensed under the GPL. For example, I think it would be sufficient for someone to draft a specification and for someone else to implement it so long as the person implementing it hadn't seen the CDDL version. Google then posts the patch publically.

DTrace Kernel Modules

The other DTrace kernel components are divided into several loadable kernel modules. There's the main DTrace module and then the instrumentation provider modules that connect to the core framework through an internal interface. These constitute the vast majority of the in-kernel DTrace code. Google modifies these to use slightly different interfaces (e.g. mutex_enter() becomes mutex_lock()); the final result is a collection of kernel modules still licensed under the CDDL. Of course, Google posts any modifications to CDDL files.

DTrace Libraries and Commands

It wouldn't happen for free, but the DTrace user-land components could just be directly ported. I don't believe there are any legal issues here.

So let's say that this is Google's DTrace port: their own hacked up kernel, some kernel modules operating under a non-GPL license, and some user-land components (also under a non-GPL license, but, again, I don't think that matters). Now some questions:

1. Legal To Run?

If Google assembled such a system, would it be legal to run on a development desktop machine? It seems to violate the GPL no more than, say, the nVidia drivers (which are presumably also running on that same desktop). What if Google installed the port on a customer-facing machine? Are there any additional legal complications there? My vote: legit.

2. Legal To Distribute?

Google distributes the Linux kernel patch (so that others can construct an identical kernel), and elsewhere they distribute the Linux-ready DTrace modules (in binary or source form): would that violate either license? It seems that it would potentially violate the GPL if a full system with both components were distributed together, but distributed individually it would certainly be fine. My vote: legit, but straying into a bit of a gray area.

3. Patch Accepted?

I'm really just putting this here for completeness. Google then submits the changes to the Linux kernel and tries to get them accepted upstream. There seems to be a precedent for the Linux kernel not accepting code that's there merely to support non-GPL kernel modules, so I doubt this would fly. My vote: not gonna happen.

4. No Source?

What if Google didn't supply the source code to either component, and didn't distribute any of it externally? My vote: legal, but morally bankrupt.

You Make The Call

So what do you think? Note that I'm not asking if it would be "good", and I'm not concluding that this would obviate the need for direct support for a native dynamic tracing framework in the Linux kernel. What I want to know is whether or not this DTrace port to Linux would be legal (and why)? If not, what would happen to poor Google (e.g. would FSF ninjas storm the Googleplex)?

If you care to comment, please include some brief statement about your legal expertise. I for one am not a lawyer, have no legal background, have read both the GPL and CDDL and have a basic understanding of both, but claim to be an authority in neither. If you don't include some information with regard to that, I may delete your comment.

Thursday Aug 02, 2007

DTrace Knockoffs

Update 8/6/2007: Those of you interested in this entry may also want to check out my next entry on the legality of a hypothetical port of DTrace to Linux.

Tools We Wish We Had -- OSCON 7/26/2007

Last week at OSCON someone set up a whiteboard with the heading "Tools We Wish We Had". People added entries (wiki-style); this one in particular caught my eye:

dtrace for Linux
or something similar

- jdub
(NO, LIKE dtrace)
(like systemtap, but not crap)


So what exactly were they asking for? DTrace is the tool developers and sysadmins have always needed -- whether they knew it or not -- but weren't able to express in words let alone code. Most simply (and least humbly) DTrace lets you express a question about nearly any aspect of the system and get the answer in a simple and concise form. And -- this is important -- you can do it safely on machines running in production as well as in development. With DTrace, you can look at the highest level software such as Ruby (as was the subject of my talk at OSCON) through all the layers of the software stack down to the lowest level kernel facilities such as I/O and scheduling. This systemic scope, production focus, and arbitrary flexibility are completely new, and provide literally unprecedented observability into complex software systems. We're scientists, we're detectives -- DTrace lets us form hypotheses, and prove or disprove them in an instant until we've come to an understanding of the problem, until we've solved the crime. Of course anyone using Linux would love a tool like that -- especially because DTrace is already available on Mac OS X, Solaris, and FreeBSD.


So is SystemTap like DTrace? To understand SystemTap, it's worth touching on the history of DTrace: Bryan cut the first code for DTrace in October of 2001; Mike tagged in moments later, and I joined up after a bit. In September of 2003 we integrated DTrace into Solaris 10 which first became available to customers in November of 2003 and formally shipped and was open-sourced in January of 2005. Almost instantly we started to see the impact in the field. In terms of performance, Solaris has strong points and weak points; with DTrace we were suddenly able to understand where those bottlenecks were on customer systems and beat out other vendors by improving our performance -- not in weeks or months, but literally in a few hours. Now, I'm not saying that DTrace was the silver bullet by which all enemies were slain -- that's clearly not the case -- but it was turning some heads and winning some deals.

Now, this bit involves some hearsay and conjecture[1], but apparently some managers of significance at Red Hat, IBM, and Intel started to take note. "We've got to do something about this DTrace," one of them undoubtedly said with a snarl (as an underling dragged off the fresh corpse of an unlucky messenger). SystemTap was a direct reaction to the results we were achieving with DTrace -- not to DTrace as an innovative technology.

When the project started in January of 2005, early discussion by the SystemTap team referred to "inspiration" that they derived from DTrace. They had a mandate to come up with an equivalent, so I assumed that they had spent the time to truly understand DTrace: to come up with an equivalent for DTrace -- or really to duplicate any technology -- the first step is to understand what it is completely. From day one, DTrace was designed to be used on mission critical systems, to always be safe, to induce no overhead when not in use, to allow for arbitrary data gathering, and to have systemic scope from the kernel to user-land and on up the stack into higher level languages. Those fundamental constraints led to some important, and non-obvious design decisions (e.g. our own language "D", a micro virtual machine, conservative probe point selection).

SystemTap -- the "Sorny" of dynamic tracing

Instead of taking the time to understand DTrace, and instead of using it and scouring the documentation, SystemTap charged ahead, completely missing the boat on safety with an architecture which is nearly impossible to secure (e.g. running a SystemTap script drops in a generated kernel module). Truly systemic scope remains an elusive goal as they're only toe-deep in user-land (forget about Ruby, Java, python, etc). And innovations in DTrace such as scalable data aggregation and speculative tracing are replicated poorly if at all. By failing to examine DTrace, and by rushing to have some sort of response, SystemTap isn't like DTrace: it's a knockoff.

Amusingly, in an apparent attempt to salvage their self-respect, the SystemTap team later renounced their inspiration. Despite frequent mentions of DTrace in their early meetings and email, it turns out, DTrace didn't actually inspire them much at all:

CVSROOT:	/cvs/systemtap
Module name:	src
Changes by:	2006-11-02 23:03:09

Modified files:
	.              : 

Log message:
	Removed refs to dtrace, to which we were giving undue credit in terms of

you're not my real dad! <slam>

Bad Artists Copy...

So uninspired was the SystemTap team by DTrace, that they don't even advocate its use according to a presentation on profiling applications ("Tools that we avoid - dtrace [sic]"). In that same presentation there's an example of a SystemTap-based tool called udpstat.stp:

$ udpstat.stp
  UDP_out  UDP_outErr  UDP_in  UDP_inErr  UDP_noPort
        0           0       0          0           0
        0           0       0          0           0
        4           0       0          0           0
        5           0       0          0           0
        5           0       0          0           0

... whose output was likely "inspired" by udpstat.d -- part of the DTraceToolkit by Brendan Gregg:

# udpstat.d
  UDP_out  UDP_outErr   UDP_in  UDP_inErr  UDP_noPort
        0           0        0          0           1
        0           0        0          0           2 
        0           0        0          0           0
     1165           0        2          0           0 

In another act of imitation reminiscent of liberal teenage borrowing from wikipedia, take a look at Eugene Teo's slides from Red Hat Summit 2007 as compared with Brendan's DTrace Topics Intro wiki (the former apparently being generated by applying a sed script to the latter). For example:

What isn’t SystemTap

  • SystemTap isn’t sentient; requires user thinking process
  • SystemTap isn’t a replacement for any existing tools

What isn't DTrace

  • DTrace isn't a replacement for kstat or SMNP
    • kstat already provides inexpensive long term monitoring.
  • DTrace isn't sentient, it needs to borrow your brain to do the thinking
  • DTrace isn't “dTrace”

... Great Artists Steal

While some have chosen the knockoff route, others have taken the time to analyze what DTrace does, understood the need, and decided that the best DTrace equivalent would be... DTrace. As with the rest of Solaris, DTrace is open source so developers and customers are excited about porting. Just a few days ago there were a couple of interesting blog posts (here and here) by users of ONTAP, NetApp's appliance OS, not for a DTrace equivalent, but for a port of DTrace itself.

DTrace is already available in the developer builds of Mac OS X 10.5, and there's a functional port for FreeBSD. I don't think it's a stretch to say that DTrace itself is becoming the measuring stick -- if not the standard. Why reinvent the wheel when you can port it?

Time For Standards

At the end of my talk last week someone asked if there was a port of DTrace to Linux (not entirely surprising since OSCON has a big Linux user contingent). I told him to ask the Linux bigwigs (several of them were also at the conference); after all, we didn't do the port to Mac OS X, and we didn't do the port to FreeBSD. We did extend our help to those developers, and they, in turn, helped DTrace by growing the community and through direct contributions[2].

We love to see DTrace on other operating systems, and we're happy to help.

So to the pretenders: enough already with the knockoffs. Your users want DTrace, you obviously want what DTrace offers, and the entire DTrace team and community are eager to help. I'm sure there's been some FUD about license incompatibilities, but it's certainly Sun's position (as stated by Sun's CEO Jonathan Schwartz at OSCON 2005) that such a port wouldn't violate the OpenSolaris license. And even closed-source kernel components are tolerated from the likes of Symantec (nee Veritas) and nVidia. Linux has been a champion of standards, eschewing proprietary solutions for free and open standards. DTrace might not yet be a standard, but a DTrace knockoff never will be.

[1] ... those are kinds of evidence
[2] including posts on the DTrace discussion forum comprehensible only to me and James

Tuesday Jul 31, 2007

DTrace for Ruby at OSCON 2007

I just got back from OSCON, a conference on Open Source that O'Reilly hosts in Portland annually. The conference offered some interesting content and side-shows with some notable highlights (more on those in the next few days). Brendan and I gave a presentation on how a crew from Sun dropped in on Twitter to help them use DTrace to discover some nasty performance problems.

Here's the presentation along with the D scripts and load generators we used for the talk.

Thursday Jul 05, 2007

DTrace "Scobleized"

Robert Scoble was kind enough to interview us last week for the ScobleShow. Robert pretty much let us riff continuously for half an hour -- we clearly haven't been getting to talk about DTrace enough lately. I thought he would trim it down a bit, but like a scene from Hard Boiled, it's all there.

This picture captures my favorite moment (around 16:23) during the interview as the three of us try to formulate a connection between DTrace and green computing...

... and this is as good a time as any to plug the talk Brendan and I will be giving at the end of the month at OSCON. We'll be talk about how team DTrace was able to solve some nasty Ruby scalability problems at Twitter; it's in the Ruby track, but the principles apply for analysis of performance problems in all languages.

Tuesday Jul 03, 2007

iSCSI DTrace provider and more to come

People often ask about the future direction of DTrace, and while we have some stuff planned for the core infrastructure, the future is really about extending DTrace's scope into every language, protocol, and application with new providers -- and this development is being done by many different members of the DTrace community. An important goal of this new work is to have consistent providers that work predictably. To that end, Brendan and I have started to sketch out an array of providers so that we can build a consistent model.

In that vein, I recently integrated a provider for our iSCSI target into Solaris Nevada (build 69, and it should be in a Solaris 10 update, but don't ask me which one). It's an USDT provider so the process ID is appended to the name; you can use \* to avoid typing the PID of the iSCSI target daemon. Here are the probes with their arguments (some of the names are obvious; for others you might need to refer to the iSCSI spec):

probe nameargs[0]args[1]args[2]
iscsi\*:::async-sendconninfo_t \*iscsiinfo_t \*-
iscsi\*:::login-commandconninfo_t \*iscsiinfo_t \*-
iscsi\*:::login-responseconninfo_t \*iscsiinfo_t \*-
iscsi\*:::logout-commandconninfo_t \*iscsiinfo_t \*-
iscsi\*:::logout-responseconninfo_t \*iscsiinfo_t \*-
iscsi\*:::data-receiveconninfo_t \*iscsiinfo_t \*-
iscsi\*:::data-requestconninfo_t \*iscsiinfo_t \*-
iscsi\*:::data-sendconninfo_t \*iscsiinfo_t \*-
iscsi\*:::nop-receiveconninfo_t \*iscsiinfo_t \*-
iscsi\*:::nop-sendconninfo_t \*iscsiinfo_t \*-
iscsi\*:::scsi-commandconninfo_t \*iscsiinfo_t \*iscsicmd_t \*
iscsi\*:::scsi-responseconninfo_t \*iscsiinfo_t \*-
iscsi\*:::task-commandconninfo_t \*iscsiinfo_t \*-
iscsi\*:::task-responseconninfo_t \*iscsiinfo_t \*-
iscsi\*:::text-commandconninfo_t \*iscsiinfo_t \*-
iscsi\*:::text-responseconninfo_t \*iscsiinfo_t \*-

The argument structures are defined as follows:

typedef struct conninfo {
        string ci_local;        /\* local host address \*/
        string ci_remote;       /\* remote host address \*/
        string ci_protocol;     /\* protocol (ipv4, ipv6, etc) \*/
} conninfo_t;

typedef struct iscsiinfo {
        string ii_target;               /\* target iqn \*/
        string ii_initiator;            /\* initiator iqn \*/
        uint64_t ii_lun;                /\* target logical unit number \*/

        uint32_t ii_itt;                /\* initiator task tag \*/
        uint32_t ii_ttt;                /\* target transfer tag \*/

        uint32_t ii_cmdsn;              /\* command sequence number \*/
        uint32_t ii_statsn;             /\* status sequence number \*/
        uint32_t ii_datasn;             /\* data sequence number \*/

        uint32_t ii_datalen;            /\* length of data payload \*/
        uint32_t ii_flags;              /\* probe-specific flags \*/
} iscsiinfo_t;

typedef struct iscsicmd {
        uint64_t ic_len;        /\* CDB length \*/
        uint8_t \*ic_cdb;        /\* CDB data \*/
} iscsicmd_t;

Note that the arguments go from most generic (the connection for the application protocol) to most specific. As an aside, we'd like future protocol providers to make use of the conninfo_t so that one could write a simple script to see a table of frequent consumers for all protocols:

        @[args[0]->ci_remote] = count();

With the iSCSI provider you can quickly see which LUNs are most active:

        @[args[1]->ii_target] = count();
or the volume of data transmitted:
        @ = sum(args[1]->ii_datalen);

Brendan has been working on a bunch of iSCSI scripts -- those are great for getting started examining iSCSI

Monday May 28, 2007

DTrace @ JavaOne 2007

This year, Jarod Jenson and I gave an updated version of our DTrace for Java (technology-based applications) talk:

The biggest new feature that we demonstrated is the forthcoming Java Statically-Defined Tracing (JSDT) which will allow developers to embed stable probes in their code as we can do today in the kernel with SDT probes and in C and C++ applications with USDT probes. While you can already trace Java applications (and C and C++ applications), static probes let the developer embed stable and semantically rich points of instrumentation that allow the user to examine the application without needing to understand its implementation. The Java version of this is so new I had literally never seen it until Jarod gave a demonstration during our talk. The basic idea is that you can define a new probe by constructing a USDTProbe instance specifying the provider, function, probe name, and argument signature:

    sun.dtrace.USDTProbe myprobe = new sun.dtrace.USDTProbe("myprovider", "myfunc", "myprobe", "ssl");

To fire the probe, you invoke the Call() method on the instance, and pass in the appropriate arguments.

Attendance was great, and we talked to a lot of people who had attended last year and had been getting mileage out of DTrace for Java. Next year, we're hoping to give the updated version of this talk on Tuesday (rather than Friday for once) and invite people to bring in their applications for a tune-up; we'll present the results in a case study-focussed talk on Friday.

Tuesday Dec 12, 2006

It's tested or it's broken

It's amazing how lousy software is. That we as a society have come to accept buggy software as an inevitability is either a testament to our collective tolerance, or -- much more likely -- the near ubiquity of crappy software. So we are guilty of accepting low standards for software, but the smaller we of software writers are guilty of setting those low expectations. And I mean we: all of us. Every programmer has at some time written buggy software (or has never written any software of any real complexity), and while we're absolutely at fault its not from lack of exertion. From time immemorial PhD candidates have scratched their whiteboard markers dry in attempts to eliminate bugs with new languages, analysis, programming techniques, and styles. The simplest method for finding bugs before they're released into the wild remains the most generally effective: testing.

Of course, programmers perform at least nominal checks before integrating new code, but there's only so much a person can test by hand. So we've invented tests suites -- collections of tests that require no interaction. Testing rigor is regarded by university computer science departments a bit like ditch-digging is by civil engineering departments: a bit pedestrian. So people tend to sort it out for themselves. Here are a couple of tips for software tests that have come out of my experience using and developing tests suites (and the DTrace test suite in particular):

It has to be easy to run

A favorite mantra of a colleague of mine is that software is only as good as its test suite. While slightly less pithy, I'd add that a test suite is only as good as one's ability to run it. At Sun we have test suites for all kinds of crazy things. Many of them require elaborate configurations, and complex installations. Even when you manage to get everything set up (or, as often as not, find someone else to get it set up) and run, comprehending the results can require a visit from the high priestess of QA to scrutinize the pigeon entrails of the output logs.

Installing and executing a test suite needs to be so simple that it can be done by any moron who might have the wherewithall to be able to modify the software it tests (hint: that's usually a lower bar than you'd like). The same goes for understanding the results. Building the DTrace test suite creates a package which you then install wherever you want to perform the testing. Running it (by executing a single command) produces output indicating how many tests passed and how many failed. A single failure represents a bug. I've used test suites where there are expected failures (things are no more broken than they were), and unexpected failures (you broke something), but differentiating the two can be nearly impossible for a novice. Keep it simple and easy to understand, or don't bother at all -- no one will run tests they can't figure out.

Complete and up-to-date

Now that people are executing the test suite because it's such a breeze, it actually needs to test the software. I think it's productive to write tests both from the perspective of the implementation and the documented behavior, but there just needs to be adequate coverage -- and the extent of the coverage is often you can test for with some accuracy. As the software is evolving, the test suite needs to evolve with it. Every enhancement or bug fix should be accompanied with new tests to verify the change, and to ensure that it's not regressed in the future. On projects I've worked on, the tests for certain features have required much more thought and effort than the feature itself, but skipping the test is absolutely unacceptable. In short: a test suite should completely test the target software at any given moment.

With the code

Originally we developed the DTrace test suite as a separate code base. This caused some unanticipated problems. Since they were in different places, we would often integrate a change to DTrace and forget about the test for a couple of days -- violating the constraint noted above. Also, projects that lagged behind the main repository would run the test suite and encounter a bunch of spurious failures because they were effectively testing out of date software. We had similar problems when back-porting new DTrace features and fixes to Solaris 10.

The solution -- in a rare split decision among the DTrace team -- was to integrate the test suite into the same repository as the code. This has absolutely been the right move. Now we can update the code and the test suite literally at the same time, and we're forced to think about testing sooner and more rigorously. It's also proved beneficial for the back-porting effort since a given snapshot of the source base contains the correct tests for that code.

Run automatically

Ideally it shouldn't be necessary, but automatically running tests is a great way to ensure that errors don't creep in because of sloppy engineering or seemingly unrelated changes. This is actually an area where DTrace is a less compelling role model. If we had put this procedure in place, it would have helped us to catch at least one bug quite a bit earlier. Solaris Nevada -- the code name for the next Solaris release -- recently changed compiler versions which resulted in a DTrace bug due to a newly aggressive optimizer on SPARC. The DTrace test suite picked this up immediately, but it wasn't run for at least a week after the compiler switch was made. We're working to have it run nightly, and our new project has been running nightly tests for a few weeks now.

Go forth and test

I've spent too many hours trying to figure out how to run arcane test suites -- just so I can't be accused of unduly contributing to the crappy state of software. I hope some of these (admittedly less-than-brilliant) lessons learned from testing DTrace have been helpfull. If you want to check out the DTrace test suite, you can see the code here and find the documentation for it here.

Technorati Tags:


Adam Leventhal, Fishworks engineer


« July 2016