Dtrace should be used just like any other tool

I believe that we will have succeeded with dtrace when all kinds of folks, from sysadmins to developers start usign dtrace as a normal part of their troubleshooting toolkit.

As a simple example, today I noticed my workstation running with a load average of just above 1. Not unusual, except I really wan't having the machine do anything at the time. Unfortunately, I haven't the output from the commands as I addressed the problem this morning and it's way out of my scroll buffer now. I'm actually pulling the commands back out of my shell history.

Maybe we've got something fork/execing a lot (that won't show in prstat). OK, let's pull out dtrace and have a look at what's going on ...

# dtrace -n 'proc::exec_common:exec-success {trace(arg0);}'

No, that was pretty quiet. OK, let's look at what is getting scheduled.

# dtrace -n 'sched::resume:on-cpu {printf("%s: %d\\n", execname, pid);}'

OK, this is scrolling past relatively quickly, so it appears that we've got a bit of contect switching going on and pretty much always got something ready to run. This probe will print a line for every time we put something onto proc. How about using an aggregation to tidy things up a bit?

# dtrace -n 'sched::resume:on-cpu {@[execname,pid] = count();}'

Hmmmm, much nicer, I can see that sched and one particular instance of mozilla (I have to run a few in different profiles) is finding itself put onto proc much more often than anything else. The mozilla in question is running with pid 20051. Running pargs 20051 tells me a bit more about which instance it is. It's the one that I use generally and do my email with. At this point I could probably have killed off and restarted the mozilla and the problem would have gone away, but I was interested in what it was doing.

OK, so what are the system calls?

# dtrace -n 'syscall:::entry /pid == 20051/ { @[probefunc] = count();}

This showed me that we were spread pretty equally around an ioctl on fd3, a read on fd5 and a write on fd6. Running pfiles 20051 tells us that these file descriptors are pipes, and very little else.

If we were interested we could also aggregate on the user level stacks to each of the calls, like this.

# dtrace -n 'syscall::ioctl:entry /pid == 20051/ { @[ustack()] = count();}'
# dtrace -n 'syscall::read:entry /pid == 20051/ { @[ustack()] = count();}'
# dtrace -n 'syscall::write:entry /pid == 20051/ { @[ustack()] = count();}'

One interesting there here was that for each of the above probes, there was only one user stack. I suppose that shouldn't be surprising given that they are always calling with the same file descriptor.

It was also interesting that the other instance I had running at the time was not doing anything even vaguely similar. There are two main differences between them.

  • My general instance also does my email
  • My non-general instance only talks to our escalation tracking system, so it's only really one website

At this point I killed and restarted the mozilla, ran some of the commands before and after starting my mail. The load on the box was much lighter, but mail didn't make a difference. I can only conclude that a site I must have visited left something playing up in mozilla. Oh well, won't be the last time.

The point of me blogging this is to show just how easy it is to start getting to the bottom of problems that would be otherwise very difficult to observe. That's the whole idea of dtrace. It's a tool to let you observe the system and it can be used just as easily as you would use the others that I interspersed here (eg pfile, prstat and pargs).

As I have been saying to Jacob before every soccer match, "Get in and have a go". It's the only way to really start to appreciate the usefulness.

Comments:

Post a Comment:
Comments are closed for this entry.
About

* - Solaris and Network Domain, Technical Support Centre


Alan is a kernel and performance engineer based in Australia who tends to have the nasty calls gravitate towards him

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today
Links
Blogroll

No bookmarks in folder

Sun Folk

No bookmarks in folder

Non-Sun Folk
Non-Sun Folks

No bookmarks in folder