One thing that performance people like to do is actually know what instructions are executed by a CPU. Crazy, I know. You can disassemble the program to know what specific machine instructions compose the program, but you still don't know what instructions specifically are executed at run-time due to branching in the code induced by the input to the program.
So way back when, clever programmers found ways to instrument their code ( or more often, somebody else's code ) in order to determine what instructions a CPU was executing. There are lots of techniques for doing this, but I'd like to talk about the one I implemented and was disclosed in a short paper that was published in the May 2004 issue of Research Disclosure Journal. I'm happy to have been published, but the two page limit did sort of cramp my style.
Some computer architectures provide a spiffy feature, often referred to as "trap on branch". It is possible to build a pretty efficient instruction tracing tool using trap on branch since it interrupts the code under observation only at the end of a basic block, rather than on every instruction executed. On average a SPARCv9 basic block consists of 5 instrutions, so using trap on branch to trace would trap 80% less than trapping on every instruction. Wait for it. Unfortunately, SPARCv9 does not define a trap on branch. Oh if only.
Work With What You've Got
Despite the lack of branch on trap in SPARCv9, there are some other
architectural goodies defined which the intrepid hacker could exploit. The first thing I looked at was the Instruction Breakpoint Register( Ok, technically not defined by SPARCv9, it's a USIII implementation defined extension ), which caused a breakpoint trap to fire when an instruction matched a user definable mask. Originally intended as a means of hot-patching broken instructions in the field, this feature never really worked as designed and was disabled in USIII+.
Another cool feature that doesn't get alot of love is the user trap handler facility, described in __sparc_utrap_install(2) and in the SPARC architecture manual (verson 9). It allows a user to define and install a custom trap handler for a specific (but diverse) set of exceptions.
Now that is pretty cool!
You Got Your Debugger In My Instruction Tracer!
Skipping ahead alittle bit, I had the idea of combining an old debugger technique for breakpointing with SPARC user trap handlers to create a poor-man's trap on branch. Many debuggers implement breakpointing by inserting an illegal instruction at the specified address in the program and catching the illegal instruction trap generated when the client program executes it. The debugger then patches up the replaced instruction or emulates it, whichever seemed easier to do at the time. An instruction tracing tool could patch up the
victim target application, replacing all it's control transfer instructions (CTIs) with ILLTRAP instructions ( SPARC instructions whose specific job is cause an illegal instruction trap ). Then when the target runs, every time it tries to branch it causes a trap which we can intercept with a user trap handler. The handler could write a record to a trace file and then fix up the broken branch and restart the target. Simplicity itself!
It's Never That Easy
Where to start. First, user trap handling for ILLTRAP was broken as detailed in bug 4650095. Second, there's the small matter of introducing ILLTRAPs into a program. Third, what to do with the CTIs that you've replaced with ILLTRAPs and all the trace output you will generate?
I discovered that an LD_AUDIT shared object would be an excellent way to introduce the trace instrumentation into a target application. The runtime linker audit interfaces are described adequately in rtld_audit(3ext) and in excellent detail in the Linker and Libraries Guide. The audit interfaces give the determined hacker a great deal of freedom to mess with an executable before control is passed to it from the runtime linker. Another advantage of this technique is the instrumenting code and the target code all share the same address space, so it's very easy to find and replace all the CTIs in the target code with ILLTRAPs.
A Commercial Plug And Then A Pause For Breath
As an engineer for Sun, I have found these three books to be invaluable:
- The SPARC Architecture Manual, version 9
- Solaris Interals
- Linker And Libraries Guide
The Linker and Library Guide in particular is an amazing piece of work, especially when you learn that it's written by the same guys who write the linker code. Those guys rock! All that's missing is a blog from Siezo!
Tomorrow, I'll post some condensed notes on the design of a user-land instruction tracing tool built on user traps called Btrace3, and why it ultimately was a flop.