Saturday Oct 04, 2008

Apple updates DTrace... again

Back in January, I ranted about Apple's ham-handed breakage in their DTrace port. After some injured feelings and teary embraces, Apple cleaned things up a bit, but some nagging issues remained as I wrote:

For the Apple folks: I'd argue that revealing the name of otherwise untraceable processes is no more transparent than what Activity Monitor provides — could I have that please?

It would be very un-Apple to — you know — communicate future development plans, but in 10.5.5, DTrace has seen another improvement. Previously when using DTrace to observe the system at large, iTunes and other paranoid apps would be hidden; now they're showing up on the radar:

# dtrace -n 'profile-1999{ @[execname] = count(); }'
dtrace: description 'profile-1999' matched 1 probe
\^C

  loginwindow                                                       2
  fseventsd                                                         3
  kdcmond                                                           5
  socketfilterfw                                                    5
  distnoted                                                         7
  mds                                                               8
  dtrace                                                           12
  punchin-helper                                                   12
  Dock                                                             20
  Mail                                                             25
  Terminal                                                         26
  SystemUIServer                                                   28
  Finder                                                           42
  Activity Monito                                                  49
  pmTool                                                           67
  WindowServer                                                    184
  iTunes                                                         1482
  kernel_task                                                    4030

And of course, you can use generally available probes to observe only those touchy apps with a predicate:

# dtrace -n 'syscall:::entry/execname == "iTunes"/{ @[probefunc] = count(); }'
dtrace: description 'syscall:::entry' matched 427 probes
\^C

...
  pwrite                                                           13
  read                                                             13
  stat64                                                           13
  open_nocancel                                                    14
  getuid                                                           22
  getdirentries                                                    26
  pread                                                            29
  stat                                                             32
  gettimeofday                                                     34
  open                                                             36
  close                                                            37
  geteuid                                                          65
  getattrlist                                                     199
  munmap                                                          328
  mmap                                                            338

Predictably, the details of iTunes are still obscured:

# dtrace -n pid42896:::entry
...
dtrace: error on enabled probe ID 225607 (ID 69364: pid42896:libSystem.B.dylib:pthread_mutex_unlock:entry):
  invalid user access in action #1
dtrace: error on enabled probe ID 225546 (ID 69425: pid42896:libSystem.B.dylib:spin_lock:entry):
  invalid user access in action #1
dtrace: 1005103 drops on CPU 1

... which is fine by me; I've got code of my own I should be investigating. While I'm loath to point it out, an astute reader and savvy DTrace user will note that Apple may have left the door open an inch wider than they had anticipated. Anyone care to post some D code that makes use of that inch? I'll post an update as a comment in a week or two if no one sees it.

Update: There were some good ideas in the comments. Here's the start of a script that can let you follow the flow of control of a thread in an "untraceable" process:

#!/usr/sbin/dtrace -s

#pragma D option quiet

pid$target:libSystem.B.dylib::entry,
pid$target:libSystem.B.dylib::return
{
	trace("this program is already traceable\\n");
	exit(0);
}

ERROR
/self->level < 0 || self->level > 40/
{
	self->level = 0;
}

ERROR
{
	this->p = ((dtrace_state_t \*)arg0)->dts_ecbs[arg1 - 1]->dte_probe;
	this->mod = this->p->dtpr_mod;
	this->func = this->p->dtpr_func;
	this->entry = ("entry" == stringof(this->p->dtpr_name));
}

ERROR
/this->entry/
{
	printf("%\*s-> %s:%s\\n", self->level \* 2, "",
	    stringof(this->mod), stringof(this->func));
	self->level++;
}

ERROR
/!this->entry/
{
	self->level--;
	printf("%\*s<- %s:%s\\n", self->level \* 2, "",
	    stringof(this->mod), stringof(this->func));
}

Friday Jan 18, 2008

Mac OS X and the missing probes

As has been thoroughly recorded, Apple has included DTrace in Mac OS X. I've been using it as often as I have the opportunity, and it's a joy to be able to use the fruits of our labor on another operating system. But I hit a rather surprising case recently which led me to discover a serious problem with Apple's implementation.

A common trick with DTrace is to use a tick probe to report data periodically. For example, the following script reports the ten most frequently accessed files every 10 seconds:

io:::start
{
	@[args[2]->fi_pathname] = count();
}

tick-10s
{
	trunc(@, 10);
	printa(@);
	trunc(@, 0);
}

This was running fine, but it seemed as though sometimes (particularly with certain apps in the background) it would occasionally skip one of the ten second iterations. Odd. So I wrote the following script to see what was going on:

profile-1000
{
	@ = count();
}

tick-1s
{
	printa(@);
	clear(@);
}

What this will do is fire a probe at 1000hz on all (logical) CPUs. Running this on a dual-core machine we'd expect to see it print out 2000 each time. Instead I saw this:

  0  22369                         :tick-1s 
             1803

  0  22369                         :tick-1s 
             1736

  0  22369                         :tick-1s 
             1641

  0  22369                         :tick-1s 
             3323

  0  22369                         :tick-1s 
             1704

  0  22369                         :tick-1s 
             1732

  0  22369                         :tick-1s 
             1697

  0  22369                         :tick-1s 
             5154

Kind of bizarre. The missing tick-1s probes explain the values over 2000, but weirder were the values so far under 2000. To explore a bit more I performed another DTrace experiment to see what applications were running when the profile probe fired:

# dtrace -n profile-997'{ @[execname] = count(); }'
dtrace: description 'profile-997' matched 1 probe
\^C

  Finder                                                            1
  configd                                                           1
  DirectoryServic                                                   2
  GrowlHelperApp                                                    2
  llipd                                                             2
  launchd                                                           3
  mDNSResponder                                                     3
  fseventsd                                                         4
  mds                                                               4
  lsd                                                               5
  ntpd                                                              6
  kdcmond                                                           7
  SystemUIServer                                                    8
  dtrace                                                            8
  loginwindow                                                       9
  pvsnatd                                                          21
  Dock                                                             41
  Activity Monito                                                  45
  pmTool                                                           52
  Google Notifier                                                  60
  Terminal                                                        153
  WindowServer                                                    238
  Safari                                                         1361
  kernel_task                                                    4247

While there's nothing suspicious about the output in itself, it was strange because I was listening to music at the time. With iTunes. Where was iTunes?
I ran the first experiment again and caused iTunes to do more work which yielded these results:

  0  22369                         :tick-1s 
             3856

  0  22369                         :tick-1s 
             1281

  0  22369                         :tick-1s 
             4770

  0  22369                         :tick-1s 
             2271

So what was iTunes doing? To answer that I again turned to DTrace and used the following enabling to see what functions were being called most frequently by iTunes (whose process ID was 332):

# dtrace -n 'pid332:::entry{ @[probefunc] = count(); }'
dtrace: description 'pid332:::entry' matched 264630 probes

I let it run for a while, made iTunes do some work, and the result when I stopped the script? Nothing. The expensive DTrace invocation clearly caused iTunes to do a lot more work, but DTrace was giving me no output.
Which started me thinking... did they? Surely not. They wouldn't disable DTrace for certain applications.

But that's exactly what Apple's done with their DTrace implementation. The notion of true systemic tracing was a bit too egalitarian for their classist sensibilities so they added this glob of lard into dtrace_probe() -- the heart of DTrace:

#if defined(__APPLE__)
        /\*
         \* If the thread on which this probe has fired belongs to a process marked P_LNOATTACH
         \* then this enabling is not permitted to observe it. Move along, nothing to see here.
         \*/
        if (ISSET(current_proc()->p_lflag, P_LNOATTACH)) {
            continue;
        }
#endif /\* __APPLE__ \*/

Wow. So Apple is explicitly preventing DTrace from examining or recording data for processes which don't permit tracing. This is antithetical to the notion of systemic tracing, antithetical to the goals of DTrace, and antithetical to the spirit of open source. I'm sure this was inserted under pressure from ISVs, but that makes the pill no easier to swallow. To say that Apple has crippled DTrace on Mac OS X would be a bit alarmist, but they've certainly undermined its efficacy and, in doing do, unintentionally damaged some of its most basic functionality. To users of Mac OS X and of DTrace: Apple has done a service by porting DTrace, but let's convince them to go one step further and port it properly.

About

Adam Leventhal, Fishworks engineer

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today