Since Solaris 2.0, the link-editors have provided a mechanism for tracing what they're doing. As this mechanism has been around for so long, plus I've used some small examples in previous postings, I figured most folks knew of its existence. I was reminded the other day that this isn't the case. For those of you unfamiliar with this tracing, here's an introduction, plus a glimpse of a new analysis tool available with Solaris 10.
You can set the environment variable LD_DEBUG to one or more pre-defined tokens. This setting causes the runtime linker, ld.so.1(1), to display information regarding the processing of any application that inherits this environment variable. The special token help provides a list of token capabilities without executing any application.
One of the most common tracing selections reveals the binding of a symbol reference to a symbol definition.
    % LD_DEBUG=bindings main
    .....
    00966: binding file=main to file=/lib/libc.so.1 symbol `_iob'
    .....
    00966: binding file=/lib/libc.so.1 to file=main: symbol `_end'
    .....
    00966: 1: transferring control: main
    .....
    00966: 1: binding file=main to file=/lib/libc.so.1: symbol `atexit'
    .....
    00966: 1: binding file=main to file=/lib/libc.so.1: symbol `exit'
Those bindings that occur before transferring to main are the immediate (data) bindings. These bindings must be completed before any user code is executed. Those bindings that occur after the transfer to main are established when the associated function is first called. These are lazy bindings.
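If you have saved a bindings trace, the immediate/lazy split can be recovered mechanically. The Python sketch below is only an illustration: it assumes the trace lines look like the sample shown in this post, and the name `split_bindings` and the regular expression are hypothetical, not part of any Solaris tool.

```python
import re

# Matches records such as:
#   00966: 1: binding file=main to file=/lib/libc.so.1: symbol `atexit'
# The colon before "symbol" is treated as optional (an assumption made so
# that every line of the sample trace is accepted).
BINDING = re.compile(
    r"binding file=(?P<src>\S+) to file=(?P<dst>[^\s:]+):? symbol `(?P<sym>[^']+)'"
)


def split_bindings(lines):
    """Return (immediate, lazy) lists of (caller, definer, symbol) tuples.

    Bindings seen before the "transferring control" line are immediate;
    bindings seen after it are lazy.
    """
    immediate, lazy = [], []
    transferred = False
    for line in lines:
        if "transferring control" in line:
            transferred = True
            continue
        m = BINDING.search(line)
        if m:
            record = (m["src"], m["dst"], m["sym"])
            (lazy if transferred else immediate).append(record)
    return immediate, lazy


# Sample trace copied from the LD_DEBUG=bindings output in this post.
SAMPLE = """\
00966: binding file=main to file=/lib/libc.so.1 symbol `_iob'
00966: binding file=/lib/libc.so.1 to file=main: symbol `_end'
00966: 1: transferring control: main
00966: 1: binding file=main to file=/lib/libc.so.1: symbol `atexit'
00966: 1: binding file=main to file=/lib/libc.so.1: symbol `exit'
"""

if __name__ == "__main__":
    immediate, lazy = split_bindings(SAMPLE.splitlines())
    print("immediate:", [sym for _, _, sym in immediate])
    print("lazy:", [sym for _, _, sym in lazy])
```

Keying on the "transferring control" line keeps the parser independent of the thread-id column, which only appears in some records.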
Another common tracing selection reveals what files are loaded.
    % LD_DEBUG=files main
    .....
    16763: file=libc.so.1; needed by main
    16763: file=/lib/libc.so.1 [ ELF ]; generating link map
    .....
    16763: 1: transferring control: ./main
    .....
    16763: 1: file=/lib/libc.so.1; \
        filter for /platform/$PLATFORM/lib/libc_psr.so.1
    16763: 1: file=/platform/SUNW,Sun-Blade-1000/lib/libc_psr.so.1; \
        filtered by /lib/libc.so.1
    16763: 1: file=/platform/SUNW,Sun-Blade-1000/lib/libc_psr.so.1 [ ELF ]; \
        generating link map
    .....
    16763: 1: file=libelf.so.1; dlopen() called from file=./main \
        [ RTLD_LAZY RTLD_LOCAL RTLD_GROUP RTLD_WORLD ]
    16763: 1: file=/lib/libelf.so.1 [ ELF ]; generating link map
This reveals initial dependencies that are loaded prior to transferring control to main. It also reveals objects that are loaded during process execution, such as filters and dlopen(3c) requests.
Note, the environment variable LD_DEBUG_OUTPUT can be used to specify a file name to which diagnostics are written (the file name gets appended with the pid). This is helpful to prevent the tracing information from interfering with normal program output, or for collecting large amounts of data for later processing.
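For scripted runs, both variables can be set programmatically. The Python sketch below is an illustration only: `trace_env` and `run_traced` are hypothetical helper names, and joining multiple tokens with commas is an assumption about the token syntax.

```python
import os
import subprocess


def trace_env(tokens, output_base=None, base_env=None):
    """Build an environment that enables runtime linker tracing.

    tokens      -- LD_DEBUG tokens, e.g. ["bindings", "files"]
    output_base -- optional file name for LD_DEBUG_OUTPUT; the runtime
                   linker appends the pid to it
    base_env    -- environment to copy (defaults to os.environ)
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LD_DEBUG"] = ",".join(tokens)  # assumption: comma-separated tokens
    if output_base is not None:
        env["LD_DEBUG_OUTPUT"] = output_base
    return env


def run_traced(argv, tokens, output_base):
    """Run argv with tracing enabled and return its exit status."""
    return subprocess.run(argv, env=trace_env(tokens, output_base)).returncode
```

Copying the environment rather than modifying `os.environ` keeps the tracing confined to the child process, so the script's own dynamic linking is not traced.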
In a previous posting I described how you could discover unused, or unreferenced dependencies. You can also discover these dependencies at runtime.
    % LD_DEBUG=unused main
    .....
    11143: 1: file=libWWW.so.1 unused: does not satisfy any references
    11143: 1: file=libXXX.so.1 unused: does not satisfy any references
    .....
    11143: 1: transferring control: ./main
    .....
Unused objects are determined prior to calling main and after any objects are loaded during process execution. The two libraries above aren't referenced before main, and thus make ideal lazy-loading candidates (that's if they are used at all).
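To turn such a trace into a list of lazy-loading candidates, a few lines of Python suffice. This is a sketch under the assumption that each diagnostic has the `file=NAME unused:` shape shown in this post; `unused_objects` is a hypothetical name.

```python
import re

# Matches diagnostics such as:
#   11143: 1: file=libWWW.so.1 unused: does not satisfy any references
UNUSED = re.compile(r"file=(?P<name>\S+)\s+unused:")


def unused_objects(lines):
    """Return the objects reported as unused, in first-seen order."""
    seen = []
    for line in lines:
        m = UNUSED.search(line)
        if m and m["name"] not in seen:
            seen.append(m["name"])
    return seen
```

Duplicates are dropped because the same object may be reported more than once: before main, and again after later objects are loaded.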
Lastly, there are our old friends, .init sections. Executing these sections in an attempt to fulfill the expectations of modern languages (I'm being polite here), and expected programming techniques, has been, shall we say, challenging. .init tracing is produced no matter what debugging token you choose.
    % LD_DEBUG=basic main
    .....
    34561: 1: calling .init (from sorted order): libYYY.so.1
    34561: 1: calling .init (done): libYYY.so.1
    .....
    34561: 1: calling .init (from sorted order): libZZZ.so.1
    .....
    34561: 1: calling .init (dynamically triggered): libAAA.so.1
    34561: 1: calling .init (done): libAAA.so.1
    .....
    34561: 1: calling .init (done): libZZZ.so.1
Note that in this example, the topologically sorted order established to fire .init's has been interrupted. We dynamically fire the .init of libAAA.so.1 that has been bound to while running the .init of libZZZ.so.1. Try to avoid this. I've seen bindings cycle back into dependencies whose .init hasn't completed.
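Spotting an interrupted .init by eye gets tedious in a long trace. The Python sketch below flags any .init that starts while another is still running. It assumes the `calling .init (KIND): OBJECT` format shown in this post, with `(done)` marking completion; `interrupted_inits` is a hypothetical name.

```python
import re

# Matches diagnostics such as:
#   34561: 1: calling .init (dynamically triggered): libAAA.so.1
INIT = re.compile(r"calling \.init \((?P<kind>[^)]+)\): (?P<obj>\S+)")


def interrupted_inits(lines):
    """Return (outer, inner) pairs where inner's .init started while
    outer's .init had not yet reported "done"."""
    running = []  # objects whose .init has started but not finished
    nested = []
    for line in lines:
        m = INIT.search(line)
        if not m:
            continue
        kind, obj = m["kind"], m["obj"]
        if kind == "done":
            if obj in running:
                running.remove(obj)
        else:
            if running:
                nested.append((running[-1], obj))
            running.append(obj)
    return nested
```

For the trace in this post, the function reports one pair: libAAA.so.1 starting inside libZZZ.so.1.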
The debugging library that provides these tracing diagnostics is also available to the link-editor, ld(1). This debugging library provides a common diagnostic format for tracing both linkers. Use the link-editor's -D option to obtain tracing info. As most compilers have already laid claim to this option, the LD_OPTIONS environment variable provides a convenient way to pass it through. For example, to see all the gory details of the symbol resolution undertaken to build an application, try:
    % LD_OPTIONS=-Dsymbols,detail cc -o main $(OBJS) $(LIBS) ...
and stand back ... the output can be substantial.
Although tracing a process at runtime can provide useful information to help diagnose process bindings, the output can be substantial. Plus, it only tells you what bindings have occurred. This information lacks the full symbolic interface data of each object involved, which in turn can hide what you think should be occurring. In Solaris 10, we added a new utility, lari(1), which provides the Link Analysis of Runtime Interfaces.
This perl(1) script analyzes a debugging trace, together with the symbol tables of each object involved in a process. lari(1) tries to discover any interesting symbol relationships. "Interesting" typically means that a symbol name exists in more than one dynamic object, and interposition is at play. Interposition can be your friend or your enemy; lari(1) doesn't know which. But historically, a number of application failures or irregularities have boiled down to some unexpected interposition that was hard to track down at the time.
For example, a typical interposition might show up as:
    % lari main
    [2:3]: foo(): /opt/ISV.I/lib/libfoo.so.1
    [2:0]: foo(): /opt/ISV.II/lib/libbar.so.1
    [2:4]: bar[0x80]: /opt/ISV.I/lib/libfoo.so.1
    [2:0]: bar[0x100]: /opt/ISV.II/lib/libbar.so.1
Here, two versions of the function foo(), and two versions of the data item bar, exist. With interposition, all bindings have resolved to the first library loaded. Hopefully the 3 callers of foo() expect the signature and functionality provided by ISV.I. But you have to wonder, do the 4 users of bar expect the array to be 0x80 or 0x100 in size?
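The size question can be checked mechanically by post-processing lari(1) output. The Python sketch below assumes the line layout shown in this post: `[definitions:bindings]`, then a name followed by `()` for functions or `[size]` for data, then the defining object. The helper names `group_symbols` and `size_mismatches` are hypothetical, and real lari(1) output may contain forms this pattern does not cover.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   [2:4]: bar[0x80]: /opt/ISV.I/lib/libfoo.so.1
# Trailing letters after the binding count (e.g. "1D") are captured as flags.
LARI = re.compile(
    r"\[(?P<defs>\d+):(?P<binds>\d+)(?P<flags>[A-Za-z]*)\]:\s+"
    r"(?P<name>[^\s(\[]+)(?P<detail>\(\)|\[[^\]]+\]):\s+(?P<path>\S+)"
)


def group_symbols(lines):
    """Map each symbol name to a list of (object, bindings, size) tuples.

    size is None for functions and an integer for data items.
    """
    groups = defaultdict(list)
    for line in lines:
        m = LARI.search(line)
        if m:
            detail = m["detail"]
            size = None if detail == "()" else int(detail[1:-1], 16)
            groups[m["name"]].append((m["path"], int(m["binds"]), size))
    return dict(groups)


def size_mismatches(groups):
    """Return the names of data symbols whose definitions disagree on size."""
    return sorted(
        name
        for name, defs in groups.items()
        if len({size for _, _, size in defs if size is not None}) > 1
    )
```

Applied to the output in this post, `size_mismatches` singles out bar, whose two definitions are 0x80 and 0x100 bytes, while foo is left for a human to judge.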
lari(1) also uncovers direct bindings or symbols that are defined with protected visibility. These can result in multiple instances of a symbol being bound to from different callers:
    % lari main
    [2:1D]: foo(): ./libA.so
    [2:1D]: foo(): ./libB.so
Again, perhaps this is what the user wants to achieve, perhaps not ... but it is interesting.
There are many more permutations of symbol diagnostics that can be produced by lari(1), including the identification of explicit interposition (such as preloading, or objects built with -z interpose), copy relocations, and dlsym(3c) requests. Plus, as lari(1) is effectively discovering the interfaces used by each object within a process, it can create versioning mapfiles that can be used as templates to rebuild each object.