Thursday May 24, 2007

Python and DTrace in build 65

A significant portion of the Xen control tools are written in Python, in particular xend. It's been somewhat awkward to observe what the daemon is doing at times, necessitating an endless cycle of 'add printf; restart' cycles. A while ago I worked on adding DTrace support to the Python packages we ship in OpenSolaris, and these changes have now made it into the latest build, 65.

As is the case with the other providers people have worked on such as Ruby and Perl, there's two simple probes for function entry and function exit. arg0 contains the filename, arg1 the function name, and arg2 has the line number. So given this simple script to trace the functions called by a particular function invocation, restricted to a given module name:

#!/usr/sbin/dtrace -ZCs

#pragma D option quiet

python$target:::function-entry
    /copyinstr(arg1) == $2 && strstr(copyinstr(arg0), $1) != NULL/ {
        self->trace = 1;
}

python$target:::function-return
    /copyinstr(arg1) == $2 && strstr(copyinstr(arg0), $1) != NULL/ {
        self->trace = 0;
}

python$target:::function-entry,python$target:::function-return
    /self->trace && strstr(copyinstr(arg0), $3) != NULL/ {
        printf("%s %s (%s:%d)\\n", probename == "function-entry" ? "->" : "<-",
            copyinstr(arg1), copyinstr(arg0), arg2);
}

We can run it as follows and get some useful results:

# ./pytrace.d \\"hg.py\\" \\"clone\\" \\"mercurial\\" -c 'hg clone /tmp/test.hg'
-> clone (build/proto/lib/python/mercurial/hg.py:65)
-> repository (build/proto/lib/python/mercurial/hg.py:54)
-> _lookup (build/proto/lib/python/mercurial/hg.py:31)
-> _local (build/proto/lib/python/mercurial/hg.py:16)
-> __getattribute__ (build/proto/lib/python/mercurial/demandload.py:56)
-> module (build/proto/lib/python/mercurial/demandload.py:53)
...

Of course, this being DTrace, we can tie all of this into general system activity as usual. I also added "ustack helper" support. This is significantly more tricky to implement, but enormously useful for following the path of Python code. For example, imagine we want to look at what's causing write()s in the clone operation above. As usual:

#!/usr/sbin/dtrace -Zs

syscall::write:entry /pid == $target/
{
        @[jstack(20)] = count();
}

END
{
        trunc(@, 2);
}

Note that we're using jstack() to make sure we have enough space allocated for the stack strings reported. Now as well as the C stack, we can see what Python functions are involved in the user stack trace:

# ./writes.d -c 'hg clone /tmp/test.hg'
...
              libc.so.1`_write+0x15
              libc.so.1`_fflush_u+0x36
              libc.so.1`fflush+0x43
              libpython2.4.so.1.0`file_flush+0x2a
              libpython2.4.so.1.0`call_function+0x32a
              libpython2.4.so.1.0`PyEval_EvalFrame+0xbdf
                [ build/proto/lib/python/mercurial/transaction.py:49 (add) ]
              libpython2.4.so.1.0`PyEval_EvalCodeEx+0x732
              libpython2.4.so.1.0`fast_function+0x112
              libpython2.4.so.1.0`call_function+0xda
              libpython2.4.so.1.0`PyEval_EvalFrame+0xbdf
                [ build/proto/lib/python/mercurial/revlog.py:1137 (addgroup) ]
              libpython2.4.so.1.0`PyEval_EvalCodeEx+0x732
              libpython2.4.so.1.0`fast_function+0x112
              libpython2.4.so.1.0`call_function+0xda
              libpython2.4.so.1.0`PyEval_EvalFrame+0xbdf
                [ build/proto/lib/python/mercurial/localrepo.py:1849 (addchangegroup) ]
              libpython2.4.so.1.0`PyEval_EvalCodeEx+0x732
              libpython2.4.so.1.0`fast_function+0x112
              libpython2.4.so.1.0`call_function+0xda
              libpython2.4.so.1.0`PyEval_EvalFrame+0xbdf
                [ build/proto/lib/python/mercurial/localrepo.py:1345 (pull) ]
              libpython2.4.so.1.0`PyEval_EvalCodeEx+0x732
              libpython2.4.so.1.0`fast_function+0x112
              148

              libc.so.1`_write+0x15
              libc.so.1`_fflush_u+0x36
              libc.so.1`fclose+0x6e
              libpython2.4.so.1.0`file_dealloc+0x36
              libpython2.4.so.1.0`frame_dealloc+0x65
              libpython2.4.so.1.0`PyEval_EvalCodeEx+0x75c
              libpython2.4.so.1.0`fast_function+0x112
              libpython2.4.so.1.0`call_function+0xda
              libpython2.4.so.1.0`PyEval_EvalFrame+0xbdf
                [ build/proto/lib/python/mercurial/localrepo.py:1849 (addchangegroup) ]
              libpython2.4.so.1.0`PyEval_EvalCodeEx+0x732
              libpython2.4.so.1.0`fast_function+0x112
              libpython2.4.so.1.0`call_function+0xda
              libpython2.4.so.1.0`PyEval_EvalFrame+0xbdf
                [ build/proto/lib/python/mercurial/localrepo.py:1345 (pull) ]
              libpython2.4.so.1.0`PyEval_EvalCodeEx+0x732
              libpython2.4.so.1.0`fast_function+0x112
              libpython2.4.so.1.0`call_function+0xda
              libpython2.4.so.1.0`PyEval_EvalFrame+0xbdf
                [ build/proto/lib/python/mercurial/localrepo.py:1957 (clone) ]
              libpython2.4.so.1.0`PyEval_EvalCodeEx+0x732
              libpython2.4.so.1.0`fast_function+0x112
              libpython2.4.so.1.0`call_function+0xda
              148

Creating a ustack helper

As anyone who's come across the Java dtrace helper source will know, creating a ustack helper is rather a black art.

When a ustack helper is present, it is called in-kernel for each entry in a stack when the ustack() action occurs (source). The D instructions in the helper action are executed such that the final string value is taken as the result of the helper. Typically for Java, there is no associated C function symbol for the PC value at that point in the stack, so the result of the helper is used directly in the stack trace. However, this is not true for Python, so that's why you see a different format above: the normal stack entry, plus the result of the helper in annotated form where it returned a result (in square brackets).

The helper is given two arguments: arg0 is the PC value of the stack entry, and arg1 is the frame pointer. The helper is expected to construct a meaningful string from just those values. In Python, the PyEval_EvalFrame function always has a PyFrameObject \* as one of its arguments. By having the helper look at this pointer value and dig around the structures, we can find pointers to the strings containing the file name and function, as well as the line number. We can copy these strings in, and, using alloca() to give ourselves some scratch space, build up the annotation string you see above.

Debugging helpers isn't particularly easy, since it lives and runs in probe context. You can use mdb's DTrace debugging facilities to find out what happened, and some careful mapping between the failing D instructions and the helper source can pinpoint the problem. Using this method it was relatively easy to get a working helper for x86 32-bit. Both SPARC and x86 64-bit proved more troublesome though. The problems were both related to the need to find the PyFrameObject \* given the frame pointer. On amd64, the function we needed to trace was passing the arguments in registers, as defined architecturally, so the argument wasn't accessible on the stack via the frame pointer. On SPARC, the pointer we need was stored in a register that was subsequently re-used as a scratch register. Both problems were solved, rather cheesily, by modifying the way the function was called.

Tags:

Thursday Nov 23, 2006

64-bit Python in Nevada build 53

Coming to you in build 53 of OpenSolaris is 64-bit Python, which I worked on with Laszlo Peter of JDS fame. This means Python modules can make use of 64-bit versions of libraries, as well as 64-bit plugins. You can run the 64-bit version of Python via the path /usr/bin/amd64/python (on x86). This path isn't quite set in stone yet, so don't rely on it.

This facility didn't previously exist on any OS, so we had to make some innovations in terms of how Python lets modules build and load. In particular the Makefile used by Python previously hard-coded certain compiler flags etc. We had to make this dynamic. Also, we had to make some modifications to where Python looks for .so files when loading modules. Previously it would just assume that, say, /usr/lib/python2.4/foo.so was of the correct word size. Now, if it's running 64-bit, it will look for /usr/lib/python2.4/64/foo.so.

Similarly, building a Python module using the 64-bit Python will automagically install the .so file in the right place. Thanks to their architecture-independence, we don't need the same tricks for the .pyc files.

The need for this arose from the continuing work on Solaris dom0's running under Xen. The kernel/hypervisor interfaces provided are not 64-bit clean in the sense that 32-bit tools cannot deal with 64-bit domains; as a result, we need to run (the Python-based) xend as a native binary.

As an added bonus, Laca has also upgraded to Python 2.4.4, which finally enables the curses module on Solaris; also fixed are some niggling problems with accidental regeneration of .pyc files.

Tags:

About

levon

Search

Categories
Archives
« April 2014
MonTueWedThuFriSatSun
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
    
       
Today