Friday Mar 11, 2011

DTrace script for who exec'd a process

I was wondering what process was executing logadm on a test system so I wrote a simple dtrace one-liner:

dtrace -q -n 'proc:::exec { self->parent = execname;}' -n 'syscall::exec*:return /execname == "logadm"/ { printf("%Y %s execs %s\\n",walltimestamp,self->parent,curpsinfo->pr_psargs); }'

 Here is sample output:

2011 Mar 11 17:35:00 sh execs /usr/sbin/logadm

In this case, it turned out to be cron, but I also found the script useful to check if and when a process was called in other cases.  Actually, this example was not too useful, but I didn't want to show the actual bug that I was working on.  However, I ended up using that DTrace one-liner many times this week.

Wednesday Jul 02, 2008

simple DTrace script to print function argument

Someone asked me for a DTrace script to print out the argument to an arbitrary function. I'm not exactly sure if this is what he meant, so I wrote a script which lets you specify a function name and the position of the string argument that you want printed out and then the command to run.  Since it's a script, you can modify to print out integer args or to connect to a running process, etc...

The script is called printarg.d.  You have to run it either as root or with sufficient privileges in /etc/user_attr.  For example, this entry allows me to run dtrace scripts:

uejio::::type=normal;defaultpriv=basic,dtrace_user,dtrace_proc,dtrace_kernel

Anyway, here's an example:

$ printarg.d XLoadQueryFont 1 -c dtcalc
XLoadQueryFont: arg1=-dt-interface user-medium-r-normal-m\*-\*-\*-\*-\*-\*-\*-\*-\*
XLoadQueryFont: arg1=-dt-interface system-medium-r-normal-m sans-17-170-72-72-p-97-iso8859-1
XLoadQueryFont: arg1=-dt-interface user-medium-r-normal-m sans-14-140-72-72-m-90-iso8859-1


Here's another example:

$ printarg.d fopen 0 -c dtcalc
fopen: arg0=/home/uejio/.Xauthority
fopen: arg0=/usr/openwin/lib/locale/locale.alias
fopen: arg0=/usr/openwin/lib/locale/locale.dir
fopen: arg0=/usr/openwin/lib/locale/locale.alias
fopen: arg0=/usr/openwin/lib/locale/locale.alias
fopen: arg0=/usr/openwin/lib/locale/locale.dir
fopen: arg0=/usr/openwin/lib/X11/Xcms.txt
fopen: arg0=/home/uejio/.icons/blueprint-cursor-theme/cursors/arrow
fopen: arg0=/home/uejio/.icons/blueprint-cursor-theme/index.theme
fopen: arg0=/usr/share/icons/blueprint-cursor-theme/cursors/arrow
fopen: arg0=/usr/share/icons/blueprint-cursor-theme/index.theme
fopen: arg0=/usr/share/pixmaps/blueprint-cursor-theme/cursors/arrow
fopen: arg0=/usr/share/pixmaps/blueprint-cursor-theme/index.theme
fopen: arg0=/home/uejio/.icons/default/cursors/arrow
fopen: arg0=/home/uejio/.icons/default/index.theme
fopen: arg0=/usr/share/icons/default/cursors/arrow
fopen: arg0=/usr/share/icons/default/index.theme
fopen: arg0=/usr/share/pixmaps/default/cursors/arrow
fopen: arg0=/usr/share/pixmaps/default/index.theme
fopen: arg0=/usr/dt/appconfig/icons/C/Dtcalc.l.bm
fopen: arg0=/usr/dt/appconfig/icons/C/Dtcalc.l_m.bm


Well, I thought this was pretty cool and so easy to do in DTrace, but your mileage may vary. ;-)

If anyone knows of a simpler way to calculate the args to pass in, please let me know.

Wednesday Nov 14, 2007

DTrace Tutorial for X Window Programmers

Here's some notes on a DTrace presentation I gave to our desktop sustaining group. It's geared towards application debugging esp. for X Window System programmers and not for kernel debugging.

DTrace is usually thought of as a tool for kernel debugging. However, I have found it very useful for user level debugging, too. It's especially useful when first debugging an issue that you don't know where to begin and for issues involving applications that are already running.

I usually use DTrace for debugging the call stack.  That is, trying to figure out what functions are being called, by whom, and with what arguments.  DTrace also lets you look at return values. I've also used DTrace for performance issues such as determining how many times a function is called and how long that took.

Useful One-liners:

Here are some one-liners for client debugging.  There are a number of useful ones from Brendan Gregg's website at:

http://www.brendangregg.com/DTrace/dtrace_oneliners.txt

Here's one which prints out new processes:

dtrace -qn 'syscall::exec\*:return { printf("%Y %s\\n",walltimestamp,curpsinfo->pr_psargs); }'

If I run gnome-terminal, I see the following output:

2007 Nov 14 16:56:17 gnome-terminal
2007 Nov 14 16:56:18 gnome-pty-helper
2007 Nov 14 16:56:18 pt_chmod 4
2007 Nov 14 16:56:18 /usr/lib/utmp_update testuser /5 pts/5 19917 7 0 0 1195088178 373248 0 000000000
2007 Nov 14 16:56:18 bash

Here's one for looking at files opened by processes:

dtrace -n 'syscall::open\*:entry { printf("%s %s",execname,copyinstr(arg0)); }'

If I run gnome-terminal again, I get a lot of output:

...
  0  43837      open64:entry gnome-terminal /usr/share/themes/blueprint/gtk-2.0/vertStepDwnButton.png
  0  43837      open64:entry gnome-terminal /usr/share/themes/blueprint/gtk-2.0/menuBorderButton.png
  0  43453      open:entry nscd /etc/user_attr
  0  43453      open:entry utmpd /proc/19924/psinfo
  0  43453      open:entry gconfd-2 /export/home/testuser/.gconf/apps/panel/profiles/default/applets/...
  0  43453      open:entry gconfd-2 /export/home/testuser/.gconfd/saved_state.tmp
...

Try doing this with dbx or truss or any other tool, especially the first example!

Some things to notice about these one-liners:

The -q option means "quiet".  The -n means that the next argument is a probe name.  In these examples, we're looking at system calls so we use the syscall provider and then specify which function call and that we're looking at the entry to a function.  This probe fires at the entry to the function.

The arguments to the function are accessible via the variables arg0, arg1, arg2, etc.  So, in the open example, we're printing the first argument to the function open which is the pathname.  For strings variables, you need to use the copyinstr() function to print the contents of the string.  For integers you don't have to do any special conversion.

However, I found that it's not usually system calls that I want to look at, but rather user library calls.  Unfortunately, you can't just do something like:

# dtrace -n 'syscall::XOpenDisplay:entry'

You will get the error:

dtrace: invalid probe specifier syscall::XOpenDisplay:entry: probe
description syscall::XOpenDisplay:entry does not match any probes

DTrace only knows about system calls, but you can use the pid provider to look at all the calls in a particular process.  For example "dtrace -l" will list out the probes. You can limit that also with -n.  Let's look at the probes for metacity for an Xlib call XMoveWindow:

dtrace -l -n "pid`pgrep metacity`::XMove\*:"
   ID   PROVIDER            MODULE                          FUNCTION NAME
44083   pid19834       libX11.so.4                       XMoveWindow return
44084   pid19834       libX11.so.4                       XMoveWindow entry
44085   pid19834       libX11.so.4                       XMoveWindow 0
44086   pid19834       libX11.so.4                       XMoveWindow 1
...


Lots of output.  There are probes for both entry and return points and also arbitrary instructions.  You could trace the execution of instructions within a function as well.  However, I find dbx is probably a much easier tool to use for doing that.

In my examples, I'm only going to mention entry and return points.

So, now, let's take a look at a simple DTrace script, libX11.d. This script  just traces all calls to libX11:

# ./libX11.d `pgrep gnome-terminal`
dtrace: script './libX11.d' matched 1629 probes
CPU     ID                    FUNCTION:NAME
  0  44400                   XPending:entry
  0  44401             _XEventsQueued:entry
  0  44326                    _XFlush:entry
  0  44309                 _XFlushInt:entry
  0  44402     _X11TransBytesReadable:entry
  0  44403 _X11TransSocketBytesReadable:entry
...

Wow, lots of calls to XPending() and other functions.  We probably don't want to debug those, so before we remove them, we can format the output nicer by adding to the libX11.d script:

#pragma D option flowindent

Now we can see that XPending is calling these functions and that's probably not a very interesting function to look at, so, we can get rid of them by setting a flag in the entry probe to XPending(). Then in the return probe of XPending unset the flag.  When the flag is set, don't print out the function calls and when it is not set, then print out calls.

However, I'm also going to remove the flowindent because the indenting is sometimes annoying.  Also, I'm going to add a "quiet" option which will cause DTrace to not print out every probe that fires. So, I need to add my own printfs.  This results in, libX11_pending.d:

# ./libX11_pending.d `pgrep gnome-terminal`
XNextEvent called
_XDeq called
XFilterEvent called
XNextEvent called
_XDeq called
...
XChangeGC called
_XUpdateGCCache called
XChangeGC called
XChangeGC called
XSetClipRectangles called
_XSetClipRectangles called
XSetTSOrigin called
XFillRectangle called
...

So, we see lots and lots of libX11 functions being called.  Well, this also is too much information and probably not useful.  Suppose, we wanted to instead limit the calls to a specific type of call.  So, let's look at the next example.  libX11_grab.d.  This example prints out all the calls to any Xlib grab functions.  It also prints out the stack trace of the user process (gnome-terminal in this case) whenever the grab or ungrab function is called.  Printing out stack traces for grabs can be a problem when running in dbx since you probably can't type in the dbx window if client has a keyboard or pointer grab. So, DTrace is definitely the better tool here. In this example, I press the mouse button on the Edit menu item and get a popdown menu.

# ./libX11_grab.d `pgrep gnome-terminal`
XGrabPointer called

              libX11.so.4`XGrabPointer
              libgdk-x11-2.0.so.0.400.9`gdk_pointer_grab+0x180
XGrabKeyboard called

              libX11.so.4`XGrabKeyboard
              libgdk-x11-2.0.so.0.400.9`gdk_keyboard_grab+0x6c
...

More Complex Examples:

Now let's look at a complex example involving two processes.  Suppose, we want to know what happens in one processs when another process calls some function. The script trace2processes.d takes 3 arguments--the process id of the two processes and the function to key off of.  For example, if I want to know what the Xserver is doing when metacity calls XMoveWindow(), I would execute:

# ./trace2processes.d `pgrep Xnest` `pgrep metacity` XMoveWindow
metacity: XMoveWindow: enters
metacity: XMoveWindow: returns
Xnest: xnestGetImage: return = 1
Xnest: WriteToClient: entering
Xnest: WriteToClient: return = 2400
Xnest: DoGetImage: return = 0
Xnest: ProcGetImage: return = 0
Xnest: FlushAllOutput: entering
...

In my example, I am using the Xnest server since I was demo'ing this via a VNC session to engineers in Ireland and India and had Xnest running in VNC.

Here's another example, xscope in DTrace:

xscope.d

This shows how to use some of the Xserver probes to implement a simple DTrace version of the X debugging tool called xscope. Xscope is a tool for viewing the X protocol between client and server.

It's a pretty complex example.

First, I need to run DTrace with -C option so that it uses the preprocessor to parse the #include lines.  Then, I need include X11 header files.  However, I can't include X11/Xlib.h because there is a structure which conflicts with DTrace  So, not all Xlib structures can be used in DTrace. Unfortunately, the most important structure that you can't access is the Display structure. The xscope.d script shows an example of how to cast structures in DTrace such as casting the xEvent structure to show the component fields of an XEvent. The output of xscope is detailed in an earlier blog here.

Handy DTrace scripts from the DTrace Toolkit at:

http://www.brendangregg.com/dtrace.html#DTraceToolkit

I've used:

execsnoop - Traces execs of all processes.

dtruss - Truss written in DTrace  Runs much faster than truss so it can be very useful when truss is too slow.

Also:

memleak.d - from http://blogs.sun.com/sanjeevb/.  Lists memory leaks using DTrace.  Much faster than libumem and good when dbx access checking fails.

But, mostly, I modify one-liners from:
http://www.brendangregg.com/DTrace/dtrace_oneliners.txt

Sorry, I don't know of any way to detect memory corruption using DTrace.  

Tuesday Nov 06, 2007

xscope in dtrace ver. 0.1

I've been playing with dtrace a lot lately to help debug issues with the Xserver and some X clients. Since Alan added the dtrace probes for the Xserver in Solaris 10, I've been meaning to rewrite xscope using dtrace. It's probably a much bigger job than I have the spare time for, but I did start working on a simply version called xscope.d.  This is version 0.1 and just prints a simple one line output for each X request and event and is based on Alan's sample scripts.

Here is some example output when xlogo is started:

request-start: from client=21 (), request = X_CreateWindow
request-done: from client=21 (), request = X_ChangeWindowAttributes, resultCode = 0
request-start: from client=21 (), request = X_ChangeWindowAttributes
client-auth: client=21, from local pid=7209 (/usr/openwin/bin/xlogo)
request-done: from client=21 (/usr/openwin/bin/xlogo), request = X_ChangeWindowAttributes, resultCode = 0

This shows that the xlogo probably calls XCreateWindow() then XChangeWindowAttributes().  One odd thing about this output is that the client-auth probe which should fire when the client first connects to the Xserver seems to be called after the client makes a X_CreateWindow request. I can't figure out if that's a bug in dtrace or the Xserver probes.

I also added an example in xscope.d which gives more detail for the PropertyNotify event, so for this example, I see:

send-event: to client = 21 (/usr/openwin/bin/xlogo), event type =  PropertyNotify (28)
PropertyNotify: window=0xa80001, atom=0x27, state=0, time=381056109

What that means is that xlogo is probably changing the atom 0x27.  What atom is that?  I can use xlsatoms to list all the atoms (using the -f "%x %s" option to display in hex) and find that 0x27 is WM_NAME.  So, this event corresponds to xlogo setting the name of the window. This is not particularly useful for this case, however, I have been working on a bug with focus events and modified this script to print out details for the FocusIn and FocusOut events.  My xscope.d script has been very useful for understanding what's happening with that bug.

Thursday Dec 15, 2005

dtrace for S9 wish

While debugging an Xlib issue on Solaris 9, I sure wished dtrace was available for that OS. I just wanted to see when the values of width and height to the XCreatePixmap call were greater than a certain value. A simple dtrace script such as:

#!/usr/sbin/dtrace -s
 
pid$target:libX11:XCreatePixmap:entry
/arg2 > 4095||arg3 > 4095/
{
	printf("width=%d, height=%d, depth=%d\\n",arg2,arg3,arg4);
}

would have done the trick. But, since dtrace isn't available for S9, I ended up creating a preload library and used LD_PRELOAD. So, something like:

#include <stdio.h>
#include <dlfcn.h>
#include <X11/Xlib.h>
#include <X11/Xresource.h>
#include <X11/Xatom.h>
/\* 
   compile with:
   cc -I/usr/openwin/include -o preloadpixmap.so.1 -G -K pic preloadpixmap.c
   then:
   setenv LD_PRELOAD ./preloadpixmap.so.1
\*/

Pixmap XCreatePixmap(display, d, width, height, depth)
     Display \*display;
     Drawable d;
     unsigned int width, height;
     unsigned int depth;
{
  Pixmap pix;
  static void \*(\* fptr)() = 0;
  if (fptr == 0) {
    fptr = (void \*(\*)())dlsym(RTLD_NEXT,"XCreatePixmap");
    if (fptr == NULL) {
      (void) printf("dlopen: %s\\n", dlerror());
      return NULL;
    }
  }
     if ((width>4095)||(height>4095)) {
        printf("XCreatePixmap: width=%d, height=%d, depth=%d\\n",
		width,height,depth);
     }
  pix = (Pixmap)((\*fptr)(display,d,width,height,depth));
  return pix;
}

dtrace would have been soo much easier...

 

Tuesday Dec 06, 2005

dtrace RFE--probe user functions across processes

I finally learned to use dtrace. I attended a talk by Bryan Cantrill on dtrace at a Solaris Desktop Summit organized by John Rice. Bryan is a great speaker. He is incredibly knowledgable, animated, and makes learning dtrace exciting. He covered so much material in a short time that I probably followed only about half of it, but that was enough to get started fiddling with it.

One thing that it would be nice to have in dtrace is the ability to probe user library functions across processes. dtrace does this very nicely for system calls, but not for any arbitrary function in userland. For example, I might want to find out how many calls to some GTK library function were made when logging in to JDS by all GTK apps. I can do this for one app using pid$target, but I couldn't figure out how to do this for multiple apps and aggregate for all apps. Again, this is for any arbitrary function in user space and not system calls. I didn't get a chance to ask Bryan this question, but other dtrace experts said that it can't be done. Rats. (I remember the days when truss couldn't follow user libraries and I was so happy when we added the -u option to trace any lib including a.out.)

I'm still excited about dtrace and look forward to using it in my daily work. Thanks Bryan!

I should also remember to blog about the Solaris Desktop Summit which so far has been a great success...

 

About

uejio

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today