Relocations - careful with that debugging flag

I received an application from a customer the other day. It's quite a big sucker, consisting of the application and over 70 shared objects (that's besides the system objects that also get used).

    % size -x main *.so
    main:       2df35c + 2675a4 + 80918f8 = 0x85d81f8
    libxxx.so: 64d4d9c + 9af9f6 + 19604ba = 0x87e4c4c
    libyyy.so: 4db7aeb + 76aa4c + 32cc16c = 0x87ee6a3
    libzzz.so: 3f347ce + d8ebb1 + 4642a3b = 0x9305dba
    ....

The customer has complained that it takes a long time to load this application. In particular, it takes a long time to verify their objects using ldd(1) and the -r option. See ldd(1) in the Solaris 10 man pages section 1: User Commands.

Using ldd(1) with the -d option emulates the cost of starting a process. The relocation processing is exactly the same as that applied by ld.so.1(1) at runtime. The -r option processes all relocations, as ld.so.1(1) would if the environment variable LD_BIND_NOW were in effect. Using ldd(1) with the -r option is a convenient way of testing that all symbol references can be found for a particular object.
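The difference between the two options can be modelled in a few lines. The following Python sketch is only an illustration, not how ldd(1) or ld.so.1(1) are implemented. The record layout and the `lazy` flag are assumptions made for the example, standing in for references that the runtime linker would normally defer until first use.

```python
def relocations_to_process(relocs, bind_now=False):
    """Select the relocations that would be processed up front.

    relocs is a list of dicts with a hypothetical boolean key "lazy".
    bind_now=False models `ldd -d` (normal process startup);
    bind_now=True models `ldd -r` (as if LD_BIND_NOW were set).
    """
    return [r for r in relocs if bind_now or not r["lazy"]]


# Hypothetical relocation records, invented for this example.
relocs = [
    {"symbol": "data_ref", "lazy": False},
    {"symbol": "func_a", "lazy": True},
    {"symbol": "func_b", "lazy": True},
]

startup = relocations_to_process(relocs)                     # like ldd -d
everything = relocations_to_process(relocs, bind_now=True)   # like ldd -r
print(len(startup), len(everything))
```

In this toy model the -r case always does at least as much work as the -d case, which matches the customer's experience that -r is the slower check.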

The set of shared objects supplied by the customer does not specify any dependencies. In fact, the application seems responsible for establishing all dependencies: not only those that the application itself references, but also those needed to satisfy the requirements of every other object. If each shared object defined its own dependencies, then ldd(1) could be used on each object to validate its symbol requirements. However, with this set of objects, the only means of validating symbol requirements is to run ldd(1) against the application, and this is taking a long time.
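Why missing dependency records matter can be shown with a toy model. This is a simplified Python sketch under stated assumptions: the dictionary layout and the symbol names `xxx_init` and `yyy_helper` are invented, and only the references of the object being checked are examined, which is less than a real ldd(1) run does.

```python
def missing_symbols(name, objects):
    """Return references of `name` not satisfied by itself or by the
    objects reachable through its recorded dependencies."""
    defined, seen, pending = set(), set(), [name]
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        defined |= objects[current]["defines"]
        pending.extend(objects[current]["needed"])
    return objects[name]["references"] - defined


def make_objects(record_dependency):
    """Build two hypothetical objects, with or without a recorded dependency."""
    needed = ["libyyy.so"] if record_dependency else []
    return {
        "libxxx.so": {"defines": {"xxx_init"},
                      "references": {"yyy_helper"},
                      "needed": needed},
        "libyyy.so": {"defines": {"yyy_helper"},
                      "references": set(),
                      "needed": []},
    }


with_deps = missing_symbols("libxxx.so", make_objects(True))
without_deps = missing_symbols("libxxx.so", make_objects(False))
print(with_deps, without_deps)
```

When the dependency is recorded, the object can be checked on its own. When it is not, the check only succeeds from something that pulls in every object, which here is the application.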

A quick poke around with elfdump(1) reveals that there are over 3.3 million relocations to process for this application and its dependencies. Some profiling (DTrace is your friend here) revealed that relocation processing accounts for nearly 99% of the startup cost. By comparison, the cost of locating and mapping all the objects is trivial.

Looking a little deeper, I found that around 2.3 million relocations are RELATIVE relocations. These are relocations that simply need the base offset of the object to be added to the relocation offset. This is a simple operation, involving no symbol lookup, and accounts for only a few percent of the cost.
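A rough sketch of why RELATIVE relocations are cheap, written in Python for readability. It assumes one common formulation (the value stored at the relocation offset is the load base plus an addend) and a 64-bit little-endian image; the function name and the `(offset, addend)` record shape are illustrative and are not taken from ld.so.1(1).

```python
import struct


def apply_relative_relocs(image, base, relocs):
    """Apply RELATIVE-style relocations to an in-memory image.

    relocs is an iterable of (offset, addend) pairs. No symbol table is
    consulted: each fix-up is a single addition and a single store.
    """
    for offset, addend in relocs:
        struct.pack_into("<Q", image, offset, base + addend)
    return image


image = bytearray(16)          # stand-in for a mapped data segment
base = 0x10000                 # hypothetical load address
apply_relative_relocs(image, base, [(0, 0x20), (8, 0x48)])
first, second = struct.unpack_from("<QQ", image, 0)
print(hex(first), hex(second))
```

Each entry costs one addition and one store, so even millions of them add up to little compared with relocations that need a symbol lookup.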

The rest of the startup cost stems from the symbolic relocations, of which some 740,000 needed processing with ldd(1) and the -d option. Poking some more revealed that 680,000 of these symbols are of the form $XBGEQEsZEHHBGQS.... (it's the $X prefix that's the giveaway). These are local symbols that have been made unique and promoted to global symbols by the compilers when the -g (debugging) option is used. By all accounts, they allow the debuggers to provide fix-and-continue processing or, if your compilers have this capability, inter-object optimization.
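Spotting these symbols is a matter of checking the name prefix. Here is a minimal Python sketch, assuming you already have a list of the symbol names referenced by the relocations (for example, extracted by hand from elfdump(1) output). The sample names below are invented for the example and are not taken from the customer's objects.

```python
def count_promoted_locals(symbol_names, prefix="$X"):
    """Return (promoted, total) where promoted counts names starting
    with the compiler-generated prefix."""
    names = list(symbol_names)
    promoted = sum(1 for name in names if name.startswith(prefix))
    return promoted, len(names)


# Hypothetical symbol names, made up purely for illustration.
sample = ["$Xaaaa.static_helper", "printf", "$Xbbbb.local_table", "main"]
promoted, total = count_promoted_locals(sample)
print(promoted, total)
```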

I don't have the source for this set of objects, so I can't experiment with building them without -g. But I'm left concluding that the bulk of the startup cost of this process is due to these $X.... symbols.
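Putting the quoted figures side by side makes the proportions easier to see. The numbers in this small Python calculation are the rounded counts given above ("over 3.3 million", "around 2.3 million", "some 740,000", "680,000"), so the results are approximate.

```python
total = 3_300_000      # all relocations (rounded)
relative = 2_300_000   # RELATIVE relocations (rounded)
symbolic = 740_000     # symbolic relocations processed under ldd -d
promoted = 680_000     # symbolic relocations against $X symbols

other_symbolic = symbolic - promoted
relative_share = relative / total
promoted_share = promoted / symbolic

print(f"RELATIVE share of all relocations: {relative_share:.0%}")
print(f"$X share of symbolic relocations:  {promoted_share:.0%}")
print(f"symbolic relocations without $X:   {other_symbolic:,}")
```

So roughly nine out of ten symbolic relocations processed at startup are against the promoted $X symbols, while the far more numerous RELATIVE relocations contribute only a few percent of the cost.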

If you don't want fix-and-continue, be careful how you use the compiler flags. This overhead in relocation processing probably isn't what you want in production software.

Note that you can also build objects with the -zcombreloc flag of ld(1). This option combines the relocation sections into one table. The RELATIVE relocations are all concatenated and are represented by dynamic entries that allow the runtime linker to process them through an optimized loop that is even faster than normal relative relocation processing.
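The idea behind that optimized loop can be sketched as follows. This Python model is an assumption-laden illustration, not the runtime linker's code: it assumes the combined table places all RELATIVE entries first and that a count of them is available (my understanding is that a dynamic entry such as DT_RELACOUNT carries this count, but treat that detail as an assumption). The type tags, record layout, and `resolve` callback are all hypothetical.

```python
import struct

R_RELATIVE = "RELATIVE"   # illustrative tags, not real ELF constants
R_SYMBOLIC = "SYMBOLIC"


def process_combined(image, base, relocs, relative_count, resolve):
    """Process a combined relocation table.

    relocs is a list of (rtype, offset, addend, symbol_name) tuples whose
    first `relative_count` entries are assumed to be RELATIVE.
    """
    # Fast path: no type dispatch and no symbol lookup per entry.
    for _, offset, addend, _ in relocs[:relative_count]:
        struct.pack_into("<Q", image, offset, base + addend)
    # Slow path: every remaining entry needs its symbol resolved.
    for _, offset, addend, name in relocs[relative_count:]:
        struct.pack_into("<Q", image, offset, resolve(name) + addend)
    return image


symbols = {"foo": 0x5000}      # hypothetical symbol table
relocs = [
    (R_RELATIVE, 0, 0x10, None),
    (R_RELATIVE, 8, 0x20, None),
    (R_SYMBOLIC, 16, 4, "foo"),
]
image = bytearray(24)
process_combined(image, 0x10000, relocs, 2, symbols.__getitem__)
values = struct.unpack_from("<QQQ", image, 0)
print([hex(v) for v in values])
```

Knowing the count up front is what lets the first loop skip any per-entry inspection of the relocation type.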

Comments:

Interesting. On a related note, I've always wondered what the overhead of the incremental linking option of the Forte compilers is - I always use -xildoff, although it shouldn't be necessary unless you are using -g as well (I'm clearly paranoid ;-). There's some info in http://docs.sun.com/source/817-5064/ild.html, but not enough to say exactly what impact ild has on the subsequent ld.so processing.

Posted by Alan Burlison on August 30, 2004 at 07:30 PM PDT #

I thought an ild() expert might have answered this, guess none of them read my blog.

I don't know all that much about ild(), as it's delivered by the compiler folks. ld(1) comes with the OS. I know it leaves large holes in the object being created, which allow new .o's to be stitched in later. I'm not sure that the symbol table is polluted with promoted local symbols to accomplish this. I think there's a host of other metadata maintained to support the process.

But, ild() might not be around much longer. From the Studio 9 Release notes:

    The Incremental Link Editor (ILD) is a special-purpose
    linker that can, in limited situations, perform program
    linkage faster than the general-purpose system linker ld.
    This feature might be removed in a future release.
    When ILD is no longer available, ld will be used instead.

Posted by Rod Evans on September 13, 2004 at 01:37 AM PDT #
