By Jharres-Oracle on Dec 12, 2011
for these cases. Display of the stack with recycled frames is controlled by the stk_recycled scatenv setting, which is on by default.
Note that if there are two such optimizations in the calling sequence, there's no useful way to dig up anything but the first step. The codepath command was written to search for call linkages between such functions.
Something similar is also done for leaf functions, which are functions that never call anything else. One optimization for them is that they don't necessarily have to do a save if they're short enough to operate in the volatile global and output registers (%g and %o on SPARC).
Knowledge of this is used when walking the stack as the "next" function down the stack will use the same stack pointer as we are using now.
There are other interesting bits done in the stack which are worth noting. For example, as part of the ABI on SPARC 64-bit, stack pointers are offset by a fixed number so as to make them easily distinguishable from 32-bit stack pointers. That offset is the STACK_BIAS, which is 2047. So if you find an address that's in the range of stack addresses, but is misaligned (in this case, it ends up with a "1" as the last hex digit), it probably needs the STACK_BIAS added to it to get the number you actually want. Note that 32-bit SPARC and x86/x64 use a STACK_BIAS of 0.
For convenience, the frame pointers that Oracle Solaris Crash Analysis Tool uses already have the STACK_BIAS applied.
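The STACK_BIAS arithmetic is easy to check by hand. Here is a minimal sketch, assuming a raw 64-bit SPARC stack pointer read out of a dump; the function name and the sample address are made up for illustration and are not part of Oracle Solaris Crash Analysis Tool.

```python
STACK_BIAS_V9 = 2047  # 64-bit SPARC bias from the text; 32-bit SPARC and x86/x64 use 0


def unbias(raw_sp: int, bias: int = STACK_BIAS_V9) -> int:
    """Return the usable stack address for a raw stack pointer value.

    Assumption for this sketch: an odd value is still biased, and an
    even value has already had the bias applied and is returned as-is.
    """
    if raw_sp & 1:
        return raw_sp + bias
    return raw_sp


raw = 0x2A100577001   # hypothetical biased pointer, last hex digit is 1
real = unbias(raw)
print(hex(raw), "->", hex(real))   # 0x2a100577001 -> 0x2a100577800
```

Adding 2047 (0x7ff) to a value ending in 0x001 lands on a 0x800 boundary, which is why the "1" in the last hex digit is such a reliable hint.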
Another useful concept to know is MINFRAME. The ABI defines how arguments are passed between functions. On SPARC, when a save instruction is run, it reserves space on the stack where the input (%i) and local (%l) registers of the caller can be stored, plus space for the output registers (%o) for any functions it calls. Note that for leaf functions, this is sometimes optimized to skip the output registers - something we've tagged a MINIFRAME.
Knowing that a function uses MINFRAME is a good clue that this is a short function, and makes no use of local variables beyond what it can get away with using the local and input registers.
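To put rough numbers on MINFRAME, here is a small worked calculation. It assumes the usual SPARC layout of 16 saved window registers (8 local plus 8 input) and 6 argument slots, plus one extra 4-byte word for a hidden structure-return pointer on 32-bit. Those counts are assumptions of this sketch, so check them against the stack header on your own system before relying on them.

```python
def minframe(word_size: int, hidden_param_bytes: int = 0) -> int:
    """Smallest frame a save has to reserve, under the assumed layout."""
    window = 16 * word_size    # 8 local (%l) + 8 input (%i) registers
    arg_slots = 6 * word_size  # slots for the six register arguments
    return window + arg_slots + hidden_param_bytes


minframe32 = minframe(4, hidden_param_bytes=4)  # assumed hidden 4-byte word
minframe64 = minframe(8)

print("32-bit:", minframe32)   # 32-bit: 92
print("64-bit:", minframe64)   # 64-bit: 176
```

A frame in the stack output that is exactly this size is therefore using nothing beyond the register save area and the argument slots.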
Knowing when we switch between stacks is also useful information. A stack switch means we've moved from the userland stack space to kernel, or from one kernel stack to another. This happens on traps, and also happens when we've detected a stack overflow - if we're out of stack space where we are, we still need stack space to deal with it. In Oracle Solaris Crash Analysis Tool you can see stack switches with the stk_switch scatenv setting. It also tells you details about whose stack it is, such as the thread's kernel stack, a CPU's interrupt or idle thread's stack, or the ptl1_stk.
The ptl1_stk is a special space used for dealing with panics when we're already processing a trap. When we process a trap, we first switch to the kernel's nucleus, which is low-level code for handling all the details required for switching between userland and kernel. However, when we are already in the nucleus on SPARC, and we take a trap, we could be processing a kernel stack overflow, which means we're in trouble in the low-level code. Oracle Solaris sets aside a special stack space for dealing with those cases on SPARC - the ptl1_stk.
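One way to picture stack switch detection is to check which known stack each frame pointer falls in, and flag the places where the owner changes. The sketch below does only that. It is an illustration under our own assumptions, not how Oracle Solaris Crash Analysis Tool implements stk_switch, and the stack names and address ranges are invented.

```python
from typing import Dict, List, Optional, Tuple

StackRanges = Dict[str, Tuple[int, int]]  # name -> (low, high), high exclusive


def owner_of(addr: int, stacks: StackRanges) -> Optional[str]:
    """Return the name of the stack containing addr, or None."""
    for name, (low, high) in stacks.items():
        if low <= addr < high:
            return name
    return None


def find_switches(frame_ptrs: List[int], stacks: StackRanges):
    """List (frame index, old owner, new owner) wherever the owner changes."""
    switches = []
    prev = None
    for i, fp in enumerate(frame_ptrs):
        cur = owner_of(fp, stacks)
        if i > 0 and cur != prev:
            switches.append((i, prev, cur))
        prev = cur
    return switches


# Hypothetical ranges and frame pointers, innermost frame first.
stacks = {
    "cpu0 interrupt stack": (0x1000, 0x3000),
    "thread kernel stack": (0x8000, 0xC000),
}
frames = [0x2800, 0x2900, 0xB000, 0xB400]
print(find_switches(frames, stacks))
# [(2, 'cpu0 interrupt stack', 'thread kernel stack')]
```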
Stack overflows are another interesting area where the stack dumper can help. Kernel stack space is a limited resource - typically only 1-2 pages of memory. Kernel code needs to be aware of this, and not allocate too much stack space for local variables. Problems still arise here, and you can examine stack space usage by enabling scatenv settings stk_s_fromend and stk_s_size (both disabled by default). stk_s_fromend shows how far each frame is from the end of the stack. stk_s_size shows the size of each frame. Each kernel stack also has an unmapped page of vmem assigned to it at the end so that any accesses past the end of the stack trigger a page fault which can't be resolved and thus results in a panic. That page is referred to as the redzone. This prevents stack overflows from corrupting a neighboring thread stack.
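The two measurements described above come down to simple subtraction. Here is a minimal sketch of the idea, assuming a stack that grows downward toward its limit, an 8K page size, and frame pointers listed innermost first. All names and addresses are hypothetical, and the real output of stk_s_fromend and stk_s_size may be computed differently.

```python
from typing import List, NamedTuple


class FrameUsage(NamedTuple):
    sp: int        # stack pointer of the frame
    size: int      # bytes between this frame and its caller's frame
    from_end: int  # bytes between this frame and the stack limit


def stack_usage(frame_sps: List[int], stack_limit: int, stack_base: int) -> List[FrameUsage]:
    """Compute per-frame size and distance from the end of the stack.

    frame_sps is ordered innermost (lowest address) to outermost.
    The outermost frame is measured against stack_base.
    """
    usage = []
    for i, sp in enumerate(frame_sps):
        upper = frame_sps[i + 1] if i + 1 < len(frame_sps) else stack_base
        usage.append(FrameUsage(sp, upper - sp, sp - stack_limit))
    return usage


def in_redzone(addr: int, stack_limit: int, page_size: int = 8192) -> bool:
    """True if addr falls in the unmapped page just past the stack end."""
    return stack_limit - page_size <= addr < stack_limit


limit, base = 0x2A100574000, 0x2A100578000   # a hypothetical 16K stack
sps = [0x2A100577000, 0x2A1005770B0, 0x2A100577800]
for f in stack_usage(sps, limit, base):
    print(hex(f.sp), f.size, f.from_end)
# 0x2a100577000 176 12288
# 0x2a1005770b0 1872 12464
# 0x2a100577800 2048 14336
```

A frame whose from_end value is close to zero, or an access for which in_redzone is true, is the pattern to look for when you suspect an overflow.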
One of the most useful things the stack dumper can do is display arguments passed into a function. On SPARC, arguments are passed in registers. However, those registers are re-used by the callee, and thus can't be relied upon to determine what was passed to the function. The passed-in values can often be determined by examining the assembly code in the caller to see what it put in the output registers (input registers for leaf functions). Doing that manually is time-consuming, even if you've had a lot of practice.
The scatenv setting stk_args causes the stack dumper to attempt to calculate those passed-in values for you, and display them in the stack. It isn't perfect, and can't always determine the arguments, but it saves a lot of time in most cases. It only works for SPARC at this time.
There are a few other more obscure scatenv settings which control how the stack dumper behaves.
- stk_l_sym - decodes any numbers to a kernel symbol if possible in the long stack output
- stk_l_symonly - any numbers that can be decoded as a kernel symbol are displayed as only the kernel symbol - without the number
- stk_s_addr - displays the address of each frame in the normal stack output
- stk_s_regs - displays the values of the input registers (%i0 - %i5) in the stack output (less useful with the stack arguments available)
- stk_s_sym - decodes any numbers to a kernel symbol if possible in the normal stack output
- stk_trap_mmu_sfsr - display and decode the mmu_sfsr information available in SPARC trap frames
- stk_trap_tstate - display and decode the tstate information available in SPARC trap frames