running code from kernel dumps.
By timatworkhomeandinbetween on May 20, 2005
One off-beat program that comes in useful once or twice a year is a program I call Lazerus. Lazerus is fed a kernel crash dump and a configuration file, and then starts executing the kernel code in the crash dump from a point defined in the configuration file. The registers are set up according to instructions in the configuration file. SIGSEGV and SIGBUS handlers use libkvm to page the correct pages from the crash dump into Lazerus's 64-bit address space and then retry the execution. There are many things it can't do because it runs non-privileged, but it runs well enough to watch code flow. I mainly use it to execute functions that have well-defined actions but are too complex to work through by hand, for example functions that calculate and follow a hash chain, where the hash function is based on some tunable and the hash chains are in the dump. It would take a long time to work out by hand exactly which hash chain was chosen and where the chain was followed to. With Lazerus you set up the registers (for example %g7 is the thread pointer, %i6 is the stack pointer and %o1..6 are the input arguments), set the pc and npc, and then jump into the dump using setcontext().
So the first thing that happens is that a SIGSEGV is taken against the address in %pc. The handler mmap()s an 8k page of memory at an 8k-aligned address that spans %pc, then uses kvm_read() to suck the contents of that page from the crash dump into the mmap()'ed space. When we return from the signal handler the kernel continues the program at %pc, which now holds the instructions copied from the kernel. Soon it will either drop off the end of the page and cause the next page to be paged in, or it will branch/jump/call to a new address, causing that page to come in. If it accesses data then we fault against that address and pull it in too. Straight after pulling a page in we check whether any of the memory patching directives in the configuration file should be applied to the page.
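The fault-and-fill loop described above can be sketched in portable C. This is only an illustration under loud assumptions, not the Lazerus source: the "crash dump" is a plain memory buffer, memcpy() stands in for kvm_read(), pages are mapped into a reserved PROT_NONE window rather than at the real kernel addresses, and the system page size is used where Lazerus uses 8k. All the names (lazy_window_init, fault_handler, pages_loaded) are made up for the example. Calling mmap() from a signal handler works in practice on the usual Unix systems, but it is not on the POSIX list of async-signal-safe functions.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for the crash dump: a flat image and the
 * address window it should appear at once paged in. */
static const unsigned char *dump_image;
static uintptr_t window_base;
static size_t window_len;
static size_t page_len;
static volatile sig_atomic_t pages_loaded;

/* On a fault inside the window: map one page over the faulting
 * address and fill it from the image. Returning from the handler
 * makes the kernel retry the faulting instruction. */
static void fault_handler(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t addr = (uintptr_t)si->si_addr;
    if (addr < window_base || addr >= window_base + window_len)
        abort();                      /* a genuine wild access */

    uintptr_t page = addr & ~(uintptr_t)(page_len - 1);
    void *p = mmap((void *)page, page_len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED)
        abort();

    /* The real program would call kvm_read() here. */
    memcpy(p, dump_image + (page - window_base), page_len);
    pages_loaded++;
}

/* Reserve an inaccessible window the size of the image and install
 * the handlers. Returns the window base, or NULL on failure. */
static volatile unsigned char *lazy_window_init(const unsigned char *image,
                                                size_t len)
{
    assert(image != NULL);
    page_len = (size_t)sysconf(_SC_PAGESIZE);
    if (len == 0 || len % page_len != 0)
        return NULL;

    void *w = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (w == MAP_FAILED)
        return NULL;

    dump_image = image;
    window_base = (uintptr_t)w;
    window_len = len;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fault_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0 ||
        sigaction(SIGBUS, &sa, NULL) != 0)
        return NULL;

    return (volatile unsigned char *)w;
}
```

Reading any byte through the returned pointer faults once per page and after that runs at full speed, which is the same behaviour the text describes for instruction fetches at %pc.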
Version one of Lazerus used a complex method to detect the target of branches/jumps/calls and then used illegal instructions to jump to a handler to get access to the context after every instruction to allow single stepping.
Version two of Lazerus used a much simpler mechanism to perform single stepping. As soon as the 8k page of text was loaded, procfs was used to put an execute watchpoint across the whole 8k page. The option to get the breakpoint signal after the instruction had executed was used, so SIGTRAP was raised after every instruction, allowing the context to be printed out. This worked much better than version one, but some functions in the kernel caused an unexpected SIGSEGV. Some investigation showed that the new libthread in Solaris 9 and Solaris 10 does a good job of ensuring that %g7 is set to the userland thread address, causing any kernel function that uses CURRTHREAD to get the wrong data. Version two is much better but doesn't work.
Buried deep in my src/C directory is a program that emulates the more common SPARC instructions. It has a software register file with 8 windows. It was used to investigate a problem with ld.so.1 which was almost impossible to debug until DTrace gave us the proc provider.
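For anyone who hasn't met register windows, here is a rough sketch of what a software register file with 8 windows can look like. It is not the code from my emulator; the names (struct regfile, rf_read and friends) are invented for the example. It assumes the SPARC V9 convention that SAVE moves to the next window by incrementing CWP (V8 goes the other way), and it leaves out the add that SAVE and RESTORE also perform and the spill/fill traps taken when the windows run out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NWINDOWS 8

/* r0..r7 = %g0..%g7, r8..r15 = %o0..%o7,
 * r16..r23 = %l0..%l7, r24..r31 = %i0..%i7 */
struct regfile {
    uint64_t globals[8];
    uint64_t windowed[NWINDOWS * 16]; /* per window: 8 ins + 8 locals */
    unsigned cwp;                     /* current window pointer */
};

static void rf_init(struct regfile *rf)
{
    memset(rf, 0, sizeof *rf);
}

/* Map an architectural register number to its storage slot.
 * The outs of window w are the same storage as the ins of w+1. */
static uint64_t *rf_slot(struct regfile *rf, unsigned r)
{
    assert(r < 32);
    if (r < 8)
        return &rf->globals[r];
    if (r < 16)   /* outs live in the next window's ins */
        return &rf->windowed[((rf->cwp + 1) % NWINDOWS) * 16 + (r - 8)];
    if (r < 24)   /* locals */
        return &rf->windowed[rf->cwp * 16 + 8 + (r - 16)];
    return &rf->windowed[rf->cwp * 16 + (r - 24)];   /* ins */
}

/* %g0 always reads as zero and ignores writes. */
static uint64_t rf_read(struct regfile *rf, unsigned r)
{
    return r == 0 ? 0 : *rf_slot(rf, r);
}

static void rf_write(struct regfile *rf, unsigned r, uint64_t v)
{
    if (r != 0)
        *rf_slot(rf, r) = v;
}

/* Window rotation only; no overflow/underflow handling. */
static void rf_save(struct regfile *rf)
{
    rf->cwp = (rf->cwp + 1) % NWINDOWS;
}

static void rf_restore(struct regfile *rf)
{
    rf->cwp = (rf->cwp + NWINDOWS - 1) % NWINDOWS;
}
```

The point of the overlap is that a value the caller writes to %o0 is what the callee sees in %i0 after its save, with no copying.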
So the plan is to ditch the model of using a real SPARC processor in userland to execute code from a crash dump, and to use the software emulator instead and have it chew through the kernel code one step at a time. (There are several full-function emulators available that are used to bring up new processors before silicon is ready, but writing your own is always a good way to ensure you understand the instruction set.)
Today's goal, between working on escalations, was to get the code that processes the configuration file into decent shape. Over the years I have refined my option processing code. I tend not to use command lines, as these can be forgotten, but to put everything in a configuration file, so tidy code to process this file is essential. The current incarnation of option processing uses a structure to define the options, the type of data, the bounds and, if necessary, a function to call to apply the option. The actual code is only about 40 lines and the structure does all the work; hopefully it can be re-used.
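To give a flavour of the table-driven approach, here is a minimal sketch. It is not the actual Lazerus option code: the struct layout, the "name = value" file syntax with # comments, and the function names opt_set and opt_parse_line are all assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum opt_type { OPT_LONG, OPT_STRING };

/* One table entry per option: name, type, bounds, where the value
 * goes, and an optional hook called after the value is stored. */
struct opt_def {
    const char *name;
    enum opt_type type;
    long min, max;          /* inclusive bounds, OPT_LONG only */
    void *target;           /* long * or char * */
    size_t target_len;      /* buffer size, OPT_STRING only */
    int (*apply)(const struct opt_def *def, const char *value);
};

/* Look the option up in the table, validate and store the value. */
static int opt_set(const struct opt_def *tab, size_t n,
                   const char *name, const char *value)
{
    for (size_t i = 0; i < n; i++) {
        const struct opt_def *d = &tab[i];
        if (strcmp(d->name, name) != 0)
            continue;
        if (d->type == OPT_LONG) {
            char *end;
            errno = 0;
            long v = strtol(value, &end, 0);   /* base 0: accepts 0x.. */
            if (errno != 0 || end == value || *end != '\0' ||
                v < d->min || v > d->max)
                return -1;
            *(long *)d->target = v;
        } else {
            if (strlen(value) >= d->target_len)
                return -1;
            strcpy((char *)d->target, value);
        }
        return d->apply ? d->apply(d, value) : 0;
    }
    return -1;                                  /* unknown option */
}

/* Strip leading and trailing whitespace in place. */
static char *trim(char *s)
{
    while (isspace((unsigned char)*s))
        s++;
    char *end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return s;
}

/* Parse one "name = value" line; blank and comment lines are OK. */
static int opt_parse_line(const struct opt_def *tab, size_t n, char *line)
{
    char *hash = strchr(line, '#');
    if (hash)
        *hash = '\0';
    char *key = trim(line);
    if (*key == '\0')
        return 0;
    char *eq = strchr(key, '=');
    if (!eq)
        return -1;
    *eq = '\0';
    return opt_set(tab, n, trim(key), trim(eq + 1));
}
```

Adding a new option is then just one more line in the table, for example a hypothetical { "start_pc", OPT_LONG, 0, LONG_MAX, &start_pc, 0, NULL }, and the parsing code never has to change.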
Early next week I will get the kvm_read() code and context dumping code running, more then...