DTrace is part of this complete operating system

Earlier this week, Mr. Vaughan-Nichols at eWeek wrote a largely inaccurate and needlessly hostile article about the CDDL, and our own Andy Tucker called him on a few points. Without bothering to correct that article or respond, he went back at it again on Wednesday, this time giving air time to SCO and their blessing of the OpenSolaris program. Why Mr. McBride of SCO felt the need to give this "blessing" is unclear; Sun obviously believes it has the rights needed to make the sources to nearly all of Solaris available under whatever license(s) we choose. Without those rights, no blessing would be sufficient; with them, none is necessary. I'll chalk this up to SCO taking whatever opportunity it can to appear relevant, especially as they continue to struggle in both the marketplace and the courtroom.

Enough of that. Instead, I'd like to focus on the most obvious and significant error in this article: the assertion that

"To date, though, the only released components of OpenSolaris are programs, such as DTrace, which aren't parts of the operating system."

We don't need to be too picky about what constitutes an operating system; even the most pedantic would surely agree that a component which spans the system from user applications to the heart of the kernel is part of the operating system. Under even an extremely narrow definition, DTrace is very much a part of the Solaris operating system - and therefore also of OpenSolaris technology. Our release of DTrace includes the sources for not just the standalone program dtrace(1M), but also all of the following:

  • The userland library libdtrace(3LIB) which provides most of dtrace(1M)'s functionality
  • Three other userland programs: lockstat(1M), plockstat(1M), and intrstat(1M), which are implemented using DTrace
  • Several kernel modules: dtrace(7D), fasttrap(7D), fbt(7D), lockstat(7D), profile(7D), sdt(7D), and systrace(7D); these implement the kernel portions of DTrace
  • Code added to the kernel itself to support DTrace, such as usr/src/uts/common/os/dtrace_subr.c
  • Two additional private user libraries which provide access to Compact C Type Format (CTF) data and the proc(4) filesystem
  • Small programs demonstrating the D language and DTrace functionality
  • A variety of headers and glue

It should be apparent that this is a far more complex subsystem than one standalone user program. In fact, the source to dtrace(1M) is a single file out of the 345 we released, and constitutes only 1431 of 102,163 lines of code (about 1.4%) in this initial release. If dtrace(1M) were simply an ordinary user program, it would not require over 100,000 lines of additional code - including over 32,000 in the kernel - to make it work.

As a final example, observe this comment block from usr/src/uts/common/os/dtrace_subr.c:

 * Making available adjustable high-resolution time in DTrace is regrettably
 * more complicated than one might think it should be.  The problem is that
 * the variables related to adjusted high-resolution time (hrestime,
 * hrestime_adj and friends) are adjusted under hres_lock -- and this lock may
 * be held when we enter probe context.  One might think that we could address
 * this by having a single snapshot copy that is stored under a different lock
 * from hres_tick(), using the snapshot iff hres_lock is locked in probe
 * context.  Unfortunately, this too won't work:  because hres_lock is grabbed
 * in more than just hres_tick() context, we could enter probe context
 * concurrently on two different CPUs with both locks (hres_lock and the
 * snapshot lock) held.  As this implies, the fundamental problem is that we
 * need to have access to a snapshot of these variables that we _know_ will
 * not be locked in probe context.  To effect this, we have two snapshots
 * protected by two different locks, and we mandate that these snapshots are
 * recorded in succession by a single thread calling dtrace_hres_tick().  (We
 * assure this by calling it out of the same CY_HIGH_LEVEL cyclic that calls
 * hres_tick().)  A single thread can't be in two places at once:  one of the
 * snapshot locks is guaranteed to be unheld at all times.  The
 * dtrace_gethrestime() algorithm is thus to check first one snapshot and then
 * the other to find the unlocked snapshot.

This comment, while arcane, is clear by itself, so I will not attempt to add to it. I will only point out that if DTrace were not a part of the operating system, it would not need to concern itself with the locking rules for updates to the high-resolution system timers. Further examples of DTrace's intimate association with core features of the Solaris kernel and userland libraries can easily be found by examining the sources.
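
For readers who would rather see the shape of that idea than parse the prose, here is a minimal user-space sketch of the two-snapshot technique the comment describes. It is not the Solaris code: the names snap_t, snaps, snap_tick() and snap_read() are hypothetical, the "locks" are plain flags, and the model is single-threaded, standing in for a probe that fires while the writer is in the middle of an update. A truly concurrent reader would need more care (for instance, re-checking the lock after copying), which this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the adjusted time; not the Solaris structure. */
typedef struct {
	bool    locked;	/* set only while the writer updates this copy */
	int64_t sec;
	int64_t nsec;
} snap_t;

static snap_t snaps[2];

/*
 * Writer: assumed to be called by exactly one thread.  It records the two
 * copies in succession, so it never holds both locks at the same time.
 */
static void
snap_tick(int64_t sec, int64_t nsec)
{
	for (int i = 0; i < 2; i++) {
		/* Invariant: the other copy is unlocked while we take this one. */
		assert(!snaps[1 - i].locked);
		snaps[i].locked = true;
		snaps[i].sec = sec;
		snaps[i].nsec = nsec;
		snaps[i].locked = false;
	}
}

/*
 * Reader: never blocks.  It checks first one copy and then the other and
 * uses whichever is unlocked.  Returns the index of the copy used, or -1 if
 * both appear locked, which the single-writer invariant rules out.
 */
static int
snap_read(int64_t *sec, int64_t *nsec)
{
	for (int i = 0; i < 2; i++) {
		if (snaps[i].locked)
			continue;
		*sec = snaps[i].sec;
		*nsec = snaps[i].nsec;
		return (i);
	}
	return (-1);
}
```

If the reader is entered while the first copy is marked locked, it simply falls through to the second copy, which still holds the values from the previous tick. That is the whole point of keeping two copies: whichever one the writer is touching, the other is readable.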

Sun's DTrace experts have written extensively about their creation [more here and here to note just two] and provided a highly detailed reference manual. While much of this material may not be in a format which is accessible to the layman, even a cursory overview of the source we are offering and the breadth and depth of publications on the topic should be sufficient to satisfy one that DTrace is very much a part of the operating system. Perhaps Mr. Vaughan-Nichols was simply unfamiliar with the offering; in that case I would invite him to download the sources and inspect them himself, and to seek the opinions of expert engineers before making further claims of this sort. DTrace is very much a part of Solaris, and while we have much more to do, releasing it as open source was no trivial step.

