« scatenv | Main

Dumping Stacks

Stack dumping seems like it should be an easy thing.  After all, you're just walking a linked list.  What the truth is that stack dumping in Oracle Solaris Crash Analysis Tool is one its most complicated parts.

Sure, generating a list of function/frame pointers is pretty easy, but it turns out there are vast amounts of useful data and clues in that stack that it's useful to dig up and display (cue the digging pirate).

The first thing to consider is that when Oracle Solaris takes a trap (any kind of fault, exception or interrupt, which includes system calls), it has to save all of the registers to the stack.  This happens on system calls, interrupts, and of course, traps caused by approaching system crashes.  The reason this is done is to remember the state of the current thread before totally switching contexts to a different stack, and the operating system, therefore, needs to preserve all of the registers should it ever return so it can restore state and continue.

That's the reason you will often see register dumps in the stack.  Getting the register values at a precise point in time is valuable information - particularly when it contains state data about something going wrong and we're preparing to crash the operating system.  

You can see this trap information by enabling the stk_trap scatenv setting, which is on by default.  You can also see the user thread registers for userland threads in a core by displaying the full stack data with stack -l.

Other things which arise are architecture-specific, and related to how the compiler tries to optimize things for a particular architecture.  On SPARC, for example, a useful optimization is re-using stack space for functions we'll never return to.

For example, functions which look like:

  funcA() {
    return (funcB());
  } 
  callerFunc() {
    funcA();
  }

have no reason to return to funcA ever again.  What the compiler does in such situations is to pop the stack space it reserved for funcA for funcB's use.  We coined the term "recycled frame" for this.

The way this would normally appear in the stack is this:

  funcB+offset()
  callerFunc+offset()

which means the person looking at this stack has no information about funcA ever having been called.  If they then went to look at callerFunc, they'd see it never calls funcB.  But, if you look at callerFunc+offset, you can see a call to funcA, which isn't the next thing in the stack.  Hence you can see why the tool was enhanced to show:

  funcB+offset()
  funcA() - frame recycled
  callerFunc+offset()

for these cases.  Display of the stack with recycled frames is controlled by the stk_recycled scatenv setting, which is on by default.

Note that if there are two such optimizations in the calling sequence, there's no useful way to dig up anything but the first step.  The codepath command was written to search for call linkages between such functions.

Something similar is also done for leaf functions.  Leaf functions are functions which never call anything else.  An optimization done for those is that they don't have to necessarily do a save if they're short enough to operate in the volatile global and output registers (%g and %o on SPARC).

Knowledge of this is used when walking the stack as the "next" function down the stack will use the same stack pointer as we are using now.

There are other interesting bits done in the stack which are worth noting.  For example, as part of the ABI on SPARC 64-bit, stack pointers are offset by a number so as to make them easily distinguishable from 32-bit stack pointers.  That is the STACK_BIAS, which is 2047.  So if you find an address that's in the range of stack addresses, but is misaligned (in this case, it ends up with a "1" as the last digit), it probably needs the STACK_BIAS added to it to get the number you actually want.  Note that 32-bit SPARC, and x86/x64 use a STACK_BIAS of 0.

For convenience, the frame pointers that Oracle Solaris Crash Analysis Tool uses already have the STACK_BIAS applied. 

Another useful concept to know is MINFRAME.  The ABI defines how functions pass arguments between functions, and on SPARC, when a save instruction is run, it saves some space for input (%i) and local (%l) registers of the caller to be stored, plus space for output registers (%o) for any functions it calls (note that for leaf functions, this is sometimes optimized to skip the output registers - something we've tagged a MINIFRAME).

Knowing that a function uses MINFRAME is a good clue that this is a short function, and makes no use of local variables beyond what it can get away with using the local and input registers.

Knowing when we switch between stacks is also useful information.  This means we've switched from the userland stack space to kernel, or from one kernel stack to another.  This happens on traps, and also happens when we've detected a stack overflow - if we're out of stack space where we are, we still need stack space to deal with it.  In Oracle Solaris Crash Analysis Tool you can see stack switches with the stk_switch scatenv setting.  It also tells you details about whose stack it is, such as the thread's kernel stack, a CPU's interrupt or idle thread's stack, or the ptl1_stk.

The ptl1_stk is a special space used for dealing with panics when we're already processing a trap.  When we process a trap, we first switch to the kernel's nucleus, which is low-level code for handling all the details required switching between userland and kernel.  However, when we are already in the nucleus on SPARC, and we take a trap, we could be processing a kernel stack overflow, which means we're in trouble in the low-level code, and Oracle Solaris sets aside a special stack space for dealing with those on SPARC - the ptl1_stk.

Stack overflows are another interesting area where the stack dumper can help.  Kernel stack space is a limited resource - typically only 1-2 pages of memory.  Kernel code needs to be aware of this, and not allocate too much stack space for local variables.  Problems still arise here, and you can examine stack space usage by enabling scatenv settings stk_s_fromend and stk_s_size (both disabled by default).  stk_s_fromend shows how far each frame is from the end of the stack.  stk_s_size shows the size of each frame. Each kernel stack also has an unmapped page of vmem assigned to it at the end so that any accesses past the end of the stack trigger a page fault which can't be resolved and thus results in a panic.  That page is referred to as the redzone.  This prevents stack overflows from corrupting a neighboring thread stack.

One of the most useful things the stack dumper can do is display arguments passed into a function.  On SPARC, arguments are passed in registers. However, those registers are re-used by the callee, and thus can't be relied upon to determine what was passed to the function. The passed-in values can often be determined by examining the assembly code in the caller to see what it put in the output registers (input registers for leaf functions).  Doing that manually is time-consuming, even if you've had a lot of practice.

The scatenv setting stk_args causes the stack dumper to attempt to calculate those passed-in values for you, and display it in the stack. It isn't perfect, and can't always determine the arguments, but saves a lot of time in most cases.  It only works for SPARC at this time.

There are a few other more obscure scatenv settings which control how the stack dumper behaves. 

  • stk_l_sym - decodes any numbers to a kernel symbol if possible in the long stack output
  • stk_l_symonly - any numbers that can be decoded as a kernel symbol are displayed as only the kernel symbol - without the  number
  • stk_s_addr - displays the address of each frame in the normal stack output
  • stk_s_regs - displays the values of the input registers (%i0 - %i5) in the stack output (less useful with the stack arguments available)
  • stk_s_sym - decodes any numbers to a kernel symbol if possible in the normal stack output
  • stk_trap_mmu_sfsr - display and decode the mmu_sfsr information available in SPARC trap frames
  • stk_trap_tstate - display and decode the tstate information available in SPARC trap frames
There is a vast amount useful information available in a thread stack.  The Oracle Solaris Crash Analysis Tool stack dumper has many options to control how much and which information is displayed - probably more than anyone will ever need.  However, new pieces come up all the time and will be added.

« scatenv | Main
Comments:

Post a Comment:
  • HTML Syntax: NOT allowed
About

danaf

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today