Tuesday Jun 14, 2005

Library Bindings - let's be a little bit more precise shall we

Well - now that OpenSolaris is officially open for business I think it's well past time for another blog entry from me. This entry gives a little history on how an executable (or shared object) on Solaris binds to its various dependencies. My goal is to give some insight into what we do today, as well as to highlight an alternative binding technology which you may or may not be familiar with: Direct Bindings, enabled with ld -Bdirect. The Direct Bindings model is much more precise when resolving references between objects.

But first a little history on what we currently do. Solaris (and *nix systems in general) does the following when a process is executed. The kernel loads the required program (an ELF object) into memory and also loads the runtime linker ( ld.so.1(1) ) into memory. The kernel then transfers control to the runtime linker. It's the runtime linker's job to examine the program, find any dependencies it has (in the form of shared objects), load those shared objects into memory, and then bind all of the symbol references (function calls, data references, etc...) from the program to each of those dependencies. Of course, as it loads each shared object it must in turn do the same examination on each of them and load any dependencies they require. Once all of the dependencies are loaded and their symbols have been bound - the runtime linker fires the .init sections for each shared object loaded and finally transfers control to the executable, which calls main(). Most people think a process starts with main() but amazing things happen before we even get there.

Here we will specifically look at how the runtime linker binds the various symbol references between all of the objects loaded into memory. Let's take a simple example first - how about an application which links against a couple of shared objects and then libc.

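    % more *.c
    ::::::::::::::
    bar.c
    ::::::::::::::
    #include <stdio.h>
    void bar() {
        printf("inside of bar\n");
    }
    ::::::::::::::
    foo.c
    ::::::::::::::
    #include <stdio.h>
    void foo() {
        printf("inside of foo\n");
    }
    ::::::::::::::
    prog.c
    ::::::::::::::
    int main(int argc, char *argv[]) {
        extern void foo();
        extern void bar();
        foo();
        bar();
        return (0);
    }
    % cc    -G -o foo.so -Kpic foo.c -lc
    % cc    -G -o bar.so -Kpic bar.c -lc
    % cc    -o prog prog.c ./foo.so ./bar.so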
We've now got a program, prog, which is bound against three shared objects: foo.so, bar.so and libc.so. The program makes two function calls, one to foo() and one to bar(), both located in its dependent shared objects. By ldd'ing the executable we can see its dependencies, and a run of it shows the execution flow:
    % ldd prog
	    ./foo.so =>      ./foo.so
	    ./bar.so =>      ./bar.so
	    libc.so.1 =>     /lib/libc.so.1
	    libm.so.2 =>     /lib/libm.so.2
    % ./prog
    inside of foo
    inside of bar
Nothing too fancy really - but it's an example we can use to examine what bindings are going on. When the program prog makes reference to foo and bar - it's up to the runtime linker to find definitions for these functions and bind the program to them. First the runtime linker loads the dependent shared objects (listed above). As each object is loaded into memory we create a Link Map entry for it, and the entries are appended onto a Link Map list in the order that the objects are loaded. In the case above the Link Map list would contain:
    prog -> foo.so -> bar.so -> libc.so.1 -> libm.so.2 -> libc_psr.so.1
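
To make the load order concrete, here is a minimal sketch in C of how such a list could be built by walking dependencies breadth-first. The dependency table below is an assumption made for illustration (it is not read from any real object, and it leaves out libc_psr.so.1), and build_link_map() is a hypothetical helper, not a runtime linker interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define	MAX_DEPS	8

/* One object and the dependency names it records (NULL-terminated). */
struct obj {
	const char *name;
	const char *deps[MAX_DEPS];
};

/* Hypothetical dependency data, loosely modelled on the prog example. */
static const struct obj objs[] = {
	{ "prog",      { "foo.so", "bar.so", "libc.so.1", NULL } },
	{ "foo.so",    { "libc.so.1", NULL } },
	{ "bar.so",    { "libc.so.1", NULL } },
	{ "libc.so.1", { "libm.so.2", NULL } },
	{ "libm.so.2", { NULL } },
};

static const struct obj *
find_obj(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof (objs) / sizeof (objs[0]); i++)
		if (strcmp(objs[i].name, name) == 0)
			return (&objs[i]);
	return (NULL);
}

/*
 * Fill list[] with object names in breadth-first load order, starting
 * at root. Each object appears once. Returns the number of entries.
 */
int
build_link_map(const char *root, const char *list[], int max)
{
	int head = 0, tail = 0, i, j;

	assert(root != NULL && list != NULL);
	if (max < 1 || find_obj(root) == NULL)
		return (0);
	list[tail++] = root;
	while (head < tail) {
		const struct obj *o = find_obj(list[head++]);

		for (i = 0; o != NULL && o->deps[i] != NULL; i++) {
			int seen = 0;

			/* Skip anything already on the list. */
			for (j = 0; j < tail; j++)
				if (strcmp(list[j], o->deps[i]) == 0)
					seen = 1;
			if (!seen && tail < max)
				list[tail++] = o->deps[i];
		}
	}
	return (tail);
}
```

With this table, build_link_map("prog", ...) yields prog, foo.so, bar.so, libc.so.1, libm.so.2, which is the same shape as the list shown above.
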
When the runtime linker needs to find a definition for a symbol it starts at the head of the list and will search each object for that symbol. If it's found, it binds to that symbol - if it's not found it proceeds to the next object on the list. The following should help demonstrate what's happening. I will run the prog program, but with some runtime linker diagnostics turned on to trace what it is doing. I'm concentrating specifically on foo and bar for this example - of course there are thousands of other bindings going on:
    % LD_DEBUG=symbols,bindings ./prog
    20579: 1: symbol=foo;  lookup in file=./prog  [ ELF ]
    20579: 1: symbol=foo;  lookup in file=./foo.so  [ ELF ]
    20579: 1: binding file=./prog to file=./foo.so: symbol `foo'
    20579: 1: symbol=bar;  lookup in file=./prog  [ ELF ]
    20579: 1: symbol=bar;  lookup in file=./foo.so  [ ELF ]
    20579: 1: symbol=bar;  lookup in file=./bar.so  [ ELF ]
    20579: 1: binding file=./prog to file=./bar.so: symbol `bar'
Not so bad really, but it's not the most efficient way to find a symbol, is it? When we were looking for the symbol bar we had to go through 3 objects until we found it. Now imagine what happens when you have a more complex application which has many more shared objects with much larger symbol tables. If I look at firefox - I can see that it has over 50 shared objects loaded:
    % pldd `pgrep firefox-bin`
    28294:  /disk3/local/firefox/firefox-bin /lib/libpthread.so.1
And on average - each of those objects has a symbol table with over 2,500 symbols. Doing a linear search from the head of the link-map list until you find the symbol just doesn't seem that practical anymore. Firefox is average for modern applications these days - if you were to take a look at StarOffice you would find a single program which depends upon over 90 different shared objects.
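
The default search described above can be sketched in a few lines of C. This is only a toy model: the link_map table and lookup_default() are hypothetical names, and each object's symbol table is reduced to a short array of strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A loaded object and the symbols it defines (NULL-terminated). */
struct lm_obj {
	const char *name;
	const char *syms[4];
};

/* Hypothetical link-map list, in load order, for the prog example. */
static const struct lm_obj link_map[] = {
	{ "./prog",   { "main", NULL } },
	{ "./foo.so", { "foo", NULL } },
	{ "./bar.so", { "bar", NULL } },
};
#define	NOBJS	(sizeof (link_map) / sizeof (link_map[0]))

static int
defines(const struct lm_obj *o, const char *sym)
{
	size_t i;

	for (i = 0; i < 4 && o->syms[i] != NULL; i++)
		if (strcmp(o->syms[i], sym) == 0)
			return (1);
	return (0);
}

/*
 * Default lookup: start at the head of the list and search each object
 * in turn. Returns the name of the first object defining sym, or NULL.
 * *probes is set to the number of objects that had to be searched.
 */
const char *
lookup_default(const char *sym, int *probes)
{
	size_t i;

	assert(sym != NULL && probes != NULL);
	*probes = 0;
	for (i = 0; i < NOBJS; i++) {
		(*probes)++;
		if (defines(&link_map[i], sym))
			return (link_map[i].name);
	}
	return (NULL);
}
```

In this model foo costs two probes and bar costs three, matching the LD_DEBUG trace earlier. The cost of a lookup grows with the position of the defining object in the list, which is the problem with 50 or 90 objects.
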

There's got to be a better way, right? There is - we call it direct bindings. Instead of doing the linear search at runtime you can simply ask the link-editor to record not only what shared objects you bound against - but what symbols you obtained from each shared object. So, if you are bound with Direct Bindings, the runtime linker changes how it looks up symbols: at runtime it binds directly to the object that offered the symbol at link time. It's a much more efficient model. Here's the same prog, but this time built with direct bindings, which is done by passing the -Bdirect link-editor option on the link-line:

    % cc -Bdirect -o prog prog.c ./foo.so ./bar.so
When you link with -Bdirect the link-editor stores additional information in the object, including where each symbol was seen at link time. This can be viewed with elfdump as follows:
    % elfdump -y prog

    Syminfo Section:  .SUNW_syminfo
	 index  flgs         bound to           symbol
	  [15]  DBL      [1] ./foo.so           foo
	  [19]  DBL      [3] ./bar.so           bar
If we do the same experiment we did earlier, that of running the program and examining the actual bindings that the runtime linker is doing - we will see a much more efficient search:
    % LD_DEBUG=symbols,bindings ./prog
    20728: 1: symbol=foo;  lookup in file=./foo.so  [ ELF ]
    20728: 1: binding file=./prog to file=./foo.so: symbol `foo'
    20728: 1: symbol=bar;  lookup in file=./bar.so  [ ELF ]
    20728: 1: binding file=./prog to file=./bar.so: symbol `bar'
Notice we now find each symbol in the first object we look in. Much better.
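
Here is a minimal sketch in C of what a direct lookup amounts to. It assumes a simplified record per symbol that holds an index into a toy link-map array; the real .SUNW_syminfo entries shown by elfdump are more involved, and prog_syminfo and lookup_direct() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A loaded object and the symbols it defines (NULL-terminated). */
struct lm_obj {
	const char *name;
	const char *syms[4];
};

/* Hypothetical link-map list for the prog example. */
static const struct lm_obj link_map[] = {
	{ "./prog",   { "main", NULL } },
	{ "./foo.so", { "foo", NULL } },
	{ "./bar.so", { "bar", NULL } },
};

/* Simplified stand-in for what the link-editor recorded in prog. */
struct syminfo {
	const char *sym;
	size_t bound_to;	/* index into link_map (an assumption) */
};

static const struct syminfo prog_syminfo[] = {
	{ "foo", 1 },
	{ "bar", 2 },
};
#define	NINFO	(sizeof (prog_syminfo) / sizeof (prog_syminfo[0]))

/*
 * Direct lookup: consult the recorded binding and search only that one
 * object. Returns the defining object's name, or NULL. This sketch
 * gives up when there is no usable record; a real runtime linker has
 * more to do in that case.
 */
const char *
lookup_direct(const char *sym, int *probes)
{
	size_t i, j;

	assert(sym != NULL && probes != NULL);
	*probes = 0;
	for (i = 0; i < NINFO; i++) {
		const struct lm_obj *o;

		if (strcmp(prog_syminfo[i].sym, sym) != 0)
			continue;
		o = &link_map[prog_syminfo[i].bound_to];
		(*probes)++;
		for (j = 0; j < 4 && o->syms[j] != NULL; j++)
			if (strcmp(o->syms[j], sym) == 0)
				return (o->name);
		return (NULL);
	}
	return (NULL);
}
```

Both foo and bar are found with a single probe here, regardless of how many objects sit ahead of them on the list.
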

Direct Bindings have been in Solaris for a few releases now, although because they are not the default not everyone is familiar with them. The feature has matured quite a bit over the last few years and we are now starting to use it for some of our core shared objects. If you look at the X11 shared objects delivered with Solaris - you'll find that they are bound with direct bindings:

    % elfdump -y /usr/lib/libX11.so | head

    Syminfo Section:  .SUNW_syminfo
         index  flgs         bound to           symbol
           [1]  D            <self>             _XimXTransDisconnect
           [2]  D        [8] libc.so.1          snprintf
           [3]  D            <self>             _XcmsFreeIntensityMaps
           [4]  D            <self>             _XcmsTableSearch
           [5]  D            <self>             _XDeq
           [6]  D            <self>             XGetWMSizeHints
           [7]  D            <self>             XUnmapWindow
Besides the fact that Direct Bindings are more efficient, they are also much more precise. It can get very tricky to control the name space when you start to combine all of the shared objects that you see in modern applications. If two shared objects happen to offer a symbol of the same name (not by intention), the default binding lookup binds to the first one found, which is probably not what the user intends. If, however, we bind to exactly the definition that was found at the time the object was built, there will be many fewer surprises.
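
The name clash described above can be shown with a small C model. Everything in it is hypothetical: liba.so, libb.so and libhelper.so are made-up objects, and the scenario assumes libb.so was built against the helper symbol in libhelper.so while liba.so, loaded earlier, happens to export a helper of its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A loaded object and the symbols it defines (NULL-terminated). */
struct lm_obj {
	const char *name;
	const char *syms[4];
};

/* Hypothetical load order; two objects offer a symbol named helper. */
static const struct lm_obj link_map[] = {
	{ "app",          { "main", NULL } },
	{ "liba.so",      { "a_init", "helper", NULL } },
	{ "libb.so",      { "b_init", NULL } },
	{ "libhelper.so", { "helper", NULL } },
};
#define	NOBJS	(sizeof (link_map) / sizeof (link_map[0]))

static int
defines(const struct lm_obj *o, const char *sym)
{
	size_t i;

	for (i = 0; i < 4 && o->syms[i] != NULL; i++)
		if (strcmp(o->syms[i], sym) == 0)
			return (1);
	return (0);
}

/* Default model: the first definition on the list wins. */
const char *
bind_default(const char *sym)
{
	size_t i;

	assert(sym != NULL);
	for (i = 0; i < NOBJS; i++)
		if (defines(&link_map[i], sym))
			return (link_map[i].name);
	return (NULL);
}

/* Direct model: use the object recorded when the caller was built. */
const char *
bind_direct(size_t recorded, const char *sym)
{
	assert(sym != NULL && recorded < NOBJS);
	return (defines(&link_map[recorded], sym) ?
	    link_map[recorded].name : NULL);
}
```

For the reference from libb.so, bind_default("helper") lands on liba.so, while bind_direct(3, "helper") lands on libhelper.so, the definition libb.so was actually built against.
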

Along these lines - it's worth giving a cautionary note for those re-linking their existing applications with Direct Bindings enabled. As we apply Direct Bindings to more and more applications we have found a few cases where there are multiple definitions of a single symbol, and changing the binding model can change the behavior of the application. In most, if not all, cases this was a bug in the design of the application - but a program can come to depend on the old behavior and then fail when run with Direct Bindings.

Further details on Direct Bindings specifically and the runtime linker (ld.so.1(1)) and link-editor (ld(1)) in general can be found in the Linker and Libraries Guide which is part of the standard Solaris Documentation.

Examples of tracing what the runtime linker is doing can be found in a blog entry by Rod titled Tracing a link-edit.

Thursday Jul 22, 2004

How to build a Shared Library

It's very easy to build a Shared Library on Solaris (and *nix in general). However, the Solaris link-editor (/usr/ccs/bin/ld) has quite a few options, and it can be confusing to determine which ones to use. I'll try to lay out some general rules below.

Let the compiler do the work

First - do not invoke the link-editor (/usr/ccs/bin/ld) directly; instead run it via your compiler driver (cc/gcc/CC/g++). The compiler driver will include many magic files (crt*.o, values-xa.o, ...) which are important to the construction of your shared library. If you invoke the link-editor directly you will likely miss those files and the benefits they give you. The first thing that will fail if you drop the compiler-supplied files is that your _init and _fini routines will not run when the shared library is loaded.

Use PIC code

Compile code destined for the shared library as Position Independent Code. You do this by using the cc -Kpic or gcc -fpic options, depending upon which compiler you are using. PIC code is the most flexible and efficient code for a shared library because it permits the final library to be loaded at any address in memory without having to be modified at run-time. This permits the sharing of the library among all processes which have loaded it on a given system - hence the name Shared Library. If you include non-PIC code in a shared library it will still run, but you lose the shareability of the library. You can enforce that all code in the shared object is PIC by adding the -z text option to your link line.

List your dependencies

Always list all of your library's dependencies (what you link against) when building the shared library. By listing them at link-edit time, the built library will have recorded in it what libraries it needs at run-time. You can enforce this at build time by adding the -z defs option to your link line.

Include the proper RunPaths

By default the run-time linker (ld.so.1) will always search /lib & /usr/lib (or /lib/64 & /usr/lib/64 for 64-bit processes) to find a shared library at run-time. If your library references objects not in those directories you must also include a RunPath to find the dependencies. This is done by adding a -R {path} for each directory containing shared libraries referenced by the object being built. You should *not* depend upon LD_LIBRARY_PATH being set at run-time to find shared libraries - Rod has done a very nice post explaining why this is so.
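
As a rough illustration of how a RunPath extends the search, here is a sketch in C that produces candidate paths for a dependency: RunPath directories first, then the default directories named above. It is a simplification under stated assumptions: candidate() is a hypothetical helper, and the model deliberately ignores LD_LIBRARY_PATH and any other configuration that can affect the real search.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Default directories named in the text above. */
static const char *defaults32[] = { "/lib", "/usr/lib" };
static const char *defaults64[] = { "/lib/64", "/usr/lib/64" };

/*
 * Write the idx'th candidate path for name into buf. RunPath entries
 * come first, followed by the two default directories. Returns 0 on
 * success, or -1 if idx is past the end or buf is too small.
 */
int
candidate(const char *runpath[], int nrun, int is64, const char *name,
    int idx, char *buf, size_t len)
{
	const char *dir;
	int n;

	assert(name != NULL && buf != NULL && nrun >= 0);
	if (idx < 0 || idx >= nrun + 2)
		return (-1);
	if (idx < nrun)
		dir = runpath[idx];
	else
		dir = (is64 ? defaults64 : defaults32)[idx - nrun];
	n = snprintf(buf, len, "%s/%s", dir, name);
	return ((n < 0 || (size_t)n >= len) ? -1 : 0);
}
```

With a RunPath of /tmp/play/lib, the candidates for libfoo.so.1 in a 32-bit process come out as /tmp/play/lib/libfoo.so.1, /lib/libfoo.so.1 and /usr/lib/libfoo.so.1, in that order; the first one that exists would be used.
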


Following is an example link demonstrating all of the options I recommended above. It's much simpler than it sounds :)

  % cc -Kpic bar.c -G -o bar.so -z text -z defs -L /tmp/play/lib  \
    -R /tmp/play/lib -lfoo -lc
  % ldd -r bar.so
    libfoo.so.1 =>   /tmp/play/lib/libfoo.so.1
    libc.so.1 =>         /lib/libc.so.1
    libm.so.2 =>         /lib/libm.so.2

Note that you can use ldd to verify that your shared library is self-contained and that it has the proper RunPaths defined. If either of those hadn't been true then ldd would have given appropriate error messages.

The above is just the basics of building a shared library; if you do just the above you're off to a good start. In future posts I'll follow up with additional options you can apply. These may include Symbol Scoping, Library Versioning, Lazy Loading, Direct Bindings, Relocation Optimization, ....

You can find out more details about all of this in the Linker and Libraries Guide. This is a great resource with additional details about all of the above and much more than you ever wanted to know about the Linkers.





