The story of lazy stabs

There's a dbx feature called "lazy stabs" that is clever, but sometimes a little confusing. I figured I'd talk about it to give an overview of what's happening. There are really two parts to the idea of "lazy stabs". One part is something we do all the time: demand-loading most debug information when you first visit a source file. The other part depends on how you compile your code: most debug info can be aggregated into the a.out, or it can be left in the .o files.

You can read more about stabs and dwarf here:

stabs versus dwarf

Index debug info (stabs and dwarf)

Dbx will always demand-load line number information and other information about local symbols. So until you visit a source file for some reason, dbx won't bother loading the majority of the debug information for that file. This makes dbx start up much faster. When you first load the binary, only the global symbols and other index information are loaded.
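To make the demand-loading idea concrete, here is a minimal Python sketch. It is not how dbx is implemented; the class name, the loader function and the file names are all made up for illustration. The index is available up front, and the per-file detail is only read the first time a file is visited.

```python
class DebugInfo:
    """Toy model: the index is loaded eagerly, per-file detail on first visit."""

    def __init__(self, index, detail_loader):
        self.index = index                 # global symbol name -> source file
        self._load_detail = detail_loader  # callable: source file -> detail
        self._detail = {}                  # cache of detail already loaded
        self.loads = []                    # which files were actually loaded

    def detail_for(self, source_file):
        # Only pay the loading cost the first time a file is visited.
        if source_file not in self._detail:
            self.loads.append(source_file)
            self._detail[source_file] = self._load_detail(source_file)
        return self._detail[source_file]


def fake_loader(source_file):
    # Hypothetical stand-in for parsing line numbers and local symbols.
    return {"file": source_file, "lines": [1, 2, 3]}


info = DebugInfo({"main": "main.c", "helper": "util.c"}, fake_loader)
info.detail_for("util.c")
info.detail_for("util.c")   # second visit is served from the cache
print(info.loads)
```

In this toy model, main.c is never loaded because nothing visited it, which is the same reason start-up stays fast.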

For example, if you want to stop at "foo.h:12" then dbx needs to load the detailed source information for all files that include code from foo.h. Then dbx figures out which object files have code from line 12, and sets the breakpoint(s) you need.
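The header-file case can be sketched the same way. This is only an illustration under assumed data: the line tables below are invented, and a real debugger works from the detailed debug information rather than a Python dict. The point is that one "file:line" location can map to code in several object files.

```python
def breakpoint_sites(line_tables, filename, line):
    """Return the object files that have code generated from filename:line."""
    return sorted(
        obj for obj, entries in line_tables.items()
        if (filename, line) in entries
    )


# Hypothetical line tables: object file -> set of (source file, line) with code.
line_tables = {
    "a.o": {("a.c", 5), ("foo.h", 12)},
    "b.o": {("b.c", 9), ("foo.h", 12), ("foo.h", 14)},
    "c.o": {("c.c", 3)},
}

sites = breakpoint_sites(line_tables, "foo.h", 12)
print(sites)
```

Here a.o and b.o both include code from line 12 of foo.h, so a breakpoint is needed in each of them.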

In stabs, the index information is stored in the .stab.index section. In dwarf, the index information is stored in multiple sections with names like .debug_pubnames, .debug_varnames, etc.

Reading debug info from .o files (stabs only)

Stabs were carefully designed not to depend on relocation records (which need to be resolved by the linker).

For most functions and variables, stabs uses the linker name of the function or variable to represent that object. At runtime dbx will access the global symbol table in the a.out and look up the symbol by name. For C++, the linker name is the mangled name, so the character strings recorded as part of stabs can get very large. A few releases ago, we started using a compressed form of mangled names, which helped somewhat.

Because of that design, dbx can read most stabs from a .o file, and make sense of them. (Of course, the index stabs still come from the a.out.) This allows the a.out to have a smaller size on disk. (Note that having stabs in a program never affects the run-time size of a program, or its performance, because stabs are not loaded at runtime.)
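A rough sketch of why name-based records work straight out of a .o file, with made-up record fields, symbol names and addresses (the real stabs layout is not shown here): the record carries only a name, and the address comes from the symbol table of the linked a.out.

```python
# Hypothetical debug records as they might sit in a .o file: names, no addresses.
stab_records = [
    {"kind": "function", "linker_name": "foo"},
    {"kind": "variable", "linker_name": "counter"},
]

# Hypothetical global symbol table of the linked a.out: name -> final address.
aout_symtab = {"foo": 0x10A40, "counter": 0x20B10}


def resolve(record, symtab):
    """Find a record's address by name; returns None if the symbol is absent."""
    return symtab.get(record["linker_name"])


for record in stab_records:
    print(record["linker_name"], hex(resolve(record, aout_symtab)))
```

Nothing in the records had to be patched by the linker, which is what makes it possible to leave them behind in the .o files.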

This has bitten some people working on mozilla in the past: https://bugzilla.mozilla.org/show_bug.cgi?id=146154

Dwarf information encodes the absolute addresses of functions and variables, and so it needs to be relocated by the linker in order to make sense. That means we can't support this aspect of "lazy stabs" (really it's better called "dispersed stabs" or something like that). The a.out has to include all the dwarf information for the program. In exchange for this, dwarf takes up significantly less space in C++ programs that use long mangled names.

Larger a.outs (dwarf, or stabs with -xs)

When using stabs, you can compile with the -xs flag, which tells the compiler to collect all the stabs into the a.out. Dbx will still demand-load them, but this works better if you want to archive the binary with its debug information, or if you want to clean up your build area but keep a debuggable binary. When dbx is loading stabs from the .o files and you move the directory that contains them (or move the .o files themselves), you have to use the pathmap command in dbx to tell dbx where they went.

(Aside: You might think -xs would logically be used at link time, but you need to use -xs at compile time so the compiler can tag the stabs sections with a flag that means "accumulate into a.out". This flag causes the linker to aggregate the stabs at link time.)
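The idea behind the pathmap step mentioned above is prefix remapping: a path recorded at build time is rewritten to point at the new location. The sketch below only illustrates that idea in Python; it is not dbx's implementation or the pathmap command syntax, and the directory names are invented.

```python
def remap(path, mappings):
    """Rewrite path using the first (old_prefix, new_prefix) pair that matches."""
    for old, new in mappings:
        old = old.rstrip("/")
        # Match whole path components only, so /build/old does not match /build/older.
        if path == old or path.startswith(old + "/"):
            return new.rstrip("/") + path[len(old):]
    return path  # no mapping applies, keep the recorded path


# Hypothetical move of the build tree.
mappings = [("/build/old", "/archive/proj")]
print(remap("/build/old/obj/foo.o", mappings))
print(remap("/build/older/obj/foo.o", mappings))
```

With -xs there is nothing to remap for the stabs themselves, since they travel inside the a.out.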

The increase in a.out size with dwarf will probably be a surprise to people who are used to smaller a.outs when using stabs, but I'm personally looking forward to it. Over the years I've had to deal with many users who wanted to send me a binary with debug information in order to reproduce a bug, but with stabs the a.out is normally missing the majority of the debug information. I had to tell them to rebuild their program with the -xs option, or else tar up the entire build tree and send it to me. With dwarf, that problem won't come up again. Everything will be in the a.out.

One important thing to remember is that the stabs and dwarf information isn't ever loaded into your program when it's run. So it won't affect the runtime performance or take up any memory when your program runs. The information only takes up disk space. And it's disk space that is also taken up by the .o files. So if you previously were saving your object files so that you could debug your program, you can stop doing that if you are using a compiler that emits dwarf. It also makes it easier to keep the non-stripped version of a binary that you strip and ship as part of a product.

Comments:

Chris,
This old blog post helped me a lot. Thanks!

Posted by Russ on April 30, 2008 at 04:52 AM PDT #

I cannot find Sun documentation on its stabs format anywhere. The current GNU binutils-2.18 release does *NOT* support Sun stabs info completely and running objdump on a SunPRO C++ built binary produces stabs errors. Please contact me if you can provide the stabs format information so that binutils can support it properly.

Posted by Andrew on September 30, 2008 at 02:59 AM PDT #

About

Chris Quenelle
