Xserver provider for DTrace

A few months ago I sent mail to Sun's dtrace and X11 internal mailing lists about something I'd been playing with:

After trying to absorb as much as possible at last week's dtrace classes, I sat down with the manual and tried things out for a bit to help it sink in before I forgot it all. One of the chapters I stumbled across was the one on adding your own probes to your own applications, which reminded me of some conversations I'd had with various people (Bart and Mahmood and others I've probably forgotten). Caffeine was consumed, and one thing led to another, and after a push in the right direction from the dtrace-interest list...

# dtrace -l -n 'Xserver\*:::'
   ID   PROVIDER            MODULE                FUNCTION NAME
    4 Xserver848              Xsun                Dispatch request-start
    5 Xserver848              Xsun                Dispatch request-done
    6 Xserver848              Xsun              InitClient client-connect
    7 Xserver848              Xsun         CloseDownClient client-disconnect 

# dtrace -q -n 'Xserver\*:::request-start { t = vtimestamp } \\
  Xserver\*:::request-done { \\
   printf("Client: %3d Request: %20s Size: %5d Time: %10d\\n", \\
          arg3, copyinstr(arg0), arg2, vtimestamp -t)}'
Client:   4 Request:  X_SetClipRectangles Size:     5 Time:      46736
Client:   4 Request:           X_ChangeGC Size:     4 Time:      16307
Client:   4 Request:           X_CopyArea Size:     7 Time:      68328
Client:   4 Request:  X_SetClipRectangles Size:     5 Time:      26480
Client:   4 Request:           X_ChangeGC Size:     4 Time:       8833
Client:   4 Request:  X_PolyFillRectangle Size:     5 Time:      43680
Client:   4 Request:  X_SetClipRectangles Size:     5 Time:      28506
Client:   4 Request:           X_ChangeGC Size:     4 Time:      11920
Client:   4 Request:           X_CopyArea Size:     7 Time:      46566

[sitting at the dtlogin screen, watching the cursor blink - it could probably cache the GC's and clip lists and use a little less bandwidth and CPU to blink that cursor]

This script waits for the next client to connect, keeps a count of all the requests it makes, then prints the count and exits when that client does:

#!/usr/sbin/dtrace -s

string Xrequest[uintptr_t];

Xserver$1:::client-connect
/clientid == 0/
{
    clientid = arg0;
    printf("\\nClient %d connected - tracing requests from it\\n", clientid);
}

Xserver$1:::client-disconnect
/clientid == arg0/
{
    printf("\\nClient %d disconnected - ending trace\\n", clientid);
    exit(0);
}

Xserver$1:::request-start
/Xrequest[arg0] == ""/
{
    Xrequest[arg0] = copyinstr(arg0);
}

Xserver$1:::request-start
/arg3 == clientid/
{
    @counts[Xrequest[arg0]] = count();
}

This started out as a plaything to give me something to learn dtrace with, but it looks useful to me and could easily turn into something more if others see uses for it. Since some of these probe points are directly in the hottest code path of our benchmarks, we'd have to make sure that they don't affect our benchmark scores too much, but that shouldn't be much of a problem. Yes, you can do much of this with xscope, but this doesn't require tunnelling everything through the slow xscope proxy server and then finding some way to make sense of the huge output logs.

And I've just put probe points in the easiest and most obvious places - there's other interesting places we could put in probes - when a client does a grab for instance, or on outgoing events and/or errors.

So I guess this is also a request for comments - would others find this useful? What probe points would be useful to use in the X server and what data would you like to get out at those probe points?

For instance, this is what I picked to make available in the current probes:

request-start: request name/extension name, minor code (for extensions), request length, client id, client sequence
request-done: request name/extension name, minor code (for extensions), request length, client id, result code
client-connect & disconnect: client id

After some feedback from people, I made some refinements to the existing probes, and added some more probes, then made it available for internal use. Since then I've added a few more probes to help in tracking down the CDE window manager pixmap leak I blogged about earlier. And now, I've made it available outside Sun as well, so more people can try it out, see if the probes available are useful or if they should be modified or expanded, and give feedback on it. At some point I hope to integrate this directly into both the Xsun and Xorg servers delivered in Solaris, as well as into the open source Xorg server code, but I'd like to get some more experience with it from more people first.

After you install it, you should see this set of probes available: (the number after "Xserver" in the provider name is the process id of the currently running Xorg server process)

# dtrace -l -n 'Xserver\*:::'
   ID   PROVIDER            MODULE                          FUNCTION NAME
    4 Xserver1335              Xorg    FreeClientNeverRetainResources resource-free
    5 Xserver1335              Xorg                FreeResourceByType resource-free
    6 Xserver1335              Xorg                      FreeResource resource-free
    7 Xserver1335              Xorg                          Dispatch request-done
    8 Xserver1335              Xorg           EstablishNewConnections client-connect
    9 Xserver1335              Xorg          AllocLbxClientConnection client-connect
   10 Xserver1335              Xorg        CloseDownRetainedResources client-disconnect
   11 Xserver1335              Xorg                   CloseDownClient client-disconnect
   12 Xserver1335              Xorg                    ProcKillClient client-disconnect
33598 Xserver1335              Xorg                          Dispatch client-disconnect
33667 Xserver1335              Xorg                       AddResource resource-alloc
33668 Xserver1335              Xorg                          Dispatch request-start
33669 Xserver1335              Xorg                  ClientAuthorized client-auth
33670 Xserver1335              Xorg                  FreeAllResources resource-free
33671 Xserver1335              Xorg               FreeClientResources resource-free

One of the example scripts I've also made available, client-watch.d reports every client that connects and when it disconnects a count of how many X requests it made and how much time the X server spent processing them. For example, I captured this from logging into and then out of a Java Desktop System (GNOME 2.6) session on Solaris 10:

   connect -> id:    8
 client id -> id:    8 is from local process 1706 (/usr/bin/gnome-session)
1706:   /usr/bin/gnome-session
[...]
   connect -> id:   14
 client id -> id:   14 is from local process 1831 (/usr/bin/nautilus)
1831:   nautilus --no-default-window --sm-client-id default3
   connect -> id:   15
 client id -> id:   15 is from local process 1833 (/usr/bin/gnome-volcheck)
1833:   gnome-volcheck -i 30 -z 3 -m cdrom,floppy,zip,jaz,dvdrom --sm-client-id default
  connect -> id:   16
 client id -> id:   16 is from local process 1857 (/usr/lib/clock-applet)
1857:   /usr/lib/clock-applet --oaf-activate-iid=OAFIID:GNOME_ClockApplet_Factory --oaf
   connect -> id:   17
 client id -> id:   17 is from local process 1831 ()
1831:   nautilus --no-default-window --sm-client-id default3
   connect -> id:   18
 client id -> id:   18 is from local process 1859 (/usr/lib/wnck-applet)
1859:   /usr/lib/wnck-applet --oaf-activate-iid=OAFIID:GNOME_Wncklet_Factory --oaf-ior-
   connect -> id:   19
 client id -> id:   19 is from local process 1867 (/usr/lib/gnome-netstatus-applet)
1867:   /usr/lib/gnome-netstatus-applet --oaf-activate-iid=OAFIID:GNOME_NetstatusApplet
   connect -> id:   20
 client id -> id:   20 is from local process 1875 (/usr/lib/mixer_applet2)
1875:   /usr/lib/mixer_applet2 --oaf-activate-iid=OAFIID:GNOME_MixerApplet_Factory --oa
   connect -> id:   21
 client id -> id:   21 is from local process 1877 (/usr/lib/notification-area-applet)
1877:   /usr/lib/notification-area-applet --oaf-activate-iid=OAFIID:GNOME_NotificationA
[... logout of JDS ...]
disconnect -> id:   16, lifetime: 8508 ms, requests: 364 (55 ms of CPU time)
disconnect -> id:   12, lifetime: 9597 ms, requests: 979 (50 ms of CPU time)
disconnect -> id:   20, lifetime: 6908 ms, requests: 286 (2 ms of CPU time)
disconnect -> id:   19, lifetime: 7157 ms, requests: 424 (53 ms of CPU time)
disconnect -> id:   14, lifetime: 9466 ms, requests: 940 (1532 ms of CPU time)
disconnect -> id:   15, lifetime: 9389 ms, requests: 119 (0 ms of CPU time)
disconnect -> id:   13, lifetime: 9542 ms, requests: 4544 (209 ms of CPU time)
disconnect -> id:   21, lifetime: 6718 ms, requests: 308 (1 ms of CPU time)
disconnect -> id:    8, lifetime: 15691 ms, requests: 1607 (1530 ms of CPU tim
disconnect -> id:    7, lifetime: 15851 ms, requests: 16 (0 ms of CPU time)
disconnect -> id:   10, lifetime: 11327 ms, requests: 130 (0 ms of CPU time)
disconnect -> id:   17, lifetime: 8483 ms, requests: 16 (0 ms of CPU time)
disconnect -> id:   18, lifetime: 7402 ms, requests: 789 (14 ms of CPU time)

I've also made the request itself available to copy in so you can get any part of it you want. An example of tracing the CreatePixmap and FreePixmap calls from a single client during a JDS session:

# ./client-pixmaps.d 1306 \\"/usr/bin/nautilus\\"
Creating pixmap: id: 0x1c00004 size: 1,1
Creating pixmap: id: 0x1c0001e size: 48,35
Creating pixmap: id: 0x1c0002a size: 48,48
Creating pixmap: id: 0x1c0002c size: 48,48
Freeing pixmap: id: 0x1c0001e
Freeing pixmap: id: 0x1c00025
Creating pixmap: id: 0x1c0002d size: 48,48
Creating pixmap: id: 0x1c0002f size: 48,48
Freeing pixmap: id: 0x1c0002a
Freeing pixmap: id: 0x1c0002c
Creating pixmap: id: 0x1c00030 size: 48,48
Creating pixmap: id: 0x1c00032 size: 48,48
Freeing pixmap: id: 0x1c0002d
Freeing pixmap: id: 0x1c0002f
Creating pixmap: id: 0x1c0003f size: 2,2
Creating pixmap: id: 0x1c00041 size: 1024,768
Creating pixmap: id: 0x1c00025 size: 48,35
Creating pixmap: id: 0x1c00047 size: 48,48
Creating pixmap: id: 0x1c00049 size: 48,48
Freeing pixmap: id: 0x1c00030
Freeing pixmap: id: 0x1c00032
Creating pixmap: id: 0x1c0004a size: 1024,768
Creating pixmap: id: 0x1c0004e size: 1,1
Freeing pixmap: id: 0x1c0004e
Creating pixmap: id: 0x1c00051 size: 1,1
Freeing pixmap: id: 0x1c00051
Freeing pixmap: id: 0x1c0004a
Creating pixmap: id: 0x1c00059 size: 1,1
Freeing pixmap: id: 0x1c00059
Creating pixmap: id: 0x1c0005b size: 109,496
Freeing pixmap: id: 0x1c0005b
Freeing pixmap: id: 0x1c00064
Creating pixmap: id: 0x1c00075 size: 1024,768
Freeing pixmap: id: 0x1c00075
Creating pixmap: id: 0x1c00064 size: 1024,768
Creating pixmap: id: 0x1c00086 size: 1024,768
Freeing pixmap: id: 0x1c00086
Creating pixmap: id: 0x1c00097 size: 199,384
Freeing pixmap: id: 0x1c00097
Creating pixmap: id: 0x1c000a0 size: 199,384
Freeing pixmap: id: 0x1c000a0
Creating pixmap: id: 0x1c000a9 size: 1024,768
Freeing pixmap: id: 0x1c000a9
Freeing pixmap: id: 0x1c00047
Freeing pixmap: id: 0x1c00049
Freeing pixmap: id: 0x1c00041

(This simple example just traces - but I'm sure you could enhance it or write a perl script to post-process the output to find pixmap leaks without much trouble, and even from just the trace I note nautilus is creating pixmaps the size of the entire root window (1024 x 768 due to the video card currently in this test machine) more often than I expected it to - all I did for this test was login and choose the menu item to log out.)

Please let me know if you find this useful, find more places that probes would be useful, or have other suggestions. You can reach me via e-mail, comments on this blog, or share with myself and other interested people in the OpenSolaris discussion forums/mailing lists for the dtrace and/or the X Window System.

Meanwhile, if you've got your own applications you'd like to add probes to like this, see Bart's blog on putting developer-defined DTrace probe points in an application, Alan H's blog on statically defined dtrace probes, and the Statically Defined Tracing for User Applications chapter of the DTrace manual.

[Technorati Tags: , , , ]

Comments:

Are you the Alan Coopersmith that co-authored Quadpan with Guppy while at Lockheed in the 80s? A Brown math PhD? If so, Hi! - Randy

Posted by Randy Parker on September 16, 2005 at 06:08 AM PDT #

Alan, I am trying to use Solaris Express and having an annoying problem in which Xorg freezes on my i85x Intel video chipset. Can you give me some guidance as how to try to debug it? Can dtrace help me? Do you have any other idea I can try? Thanks in advance. -- Douglas

Posted by Douglas Atique on December 22, 2005 at 09:35 PM PST #

Post a Comment:
Comments are closed for this entry.
About

Engineer working on Oracle Solaris and with the X.Org open source community.

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle, the X.Org Foundation, or anyone else.

See Also
Follow me on twitter

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today