Monday Aug 02, 2010

Symbol Capabilities

In a previous posting I covered the use of filters, especially defining symbol specific filtering. Filters allow a binding to be redirected at runtime to an alternative filtee. This technique has been used to provide optimized, platform specific, instances of functions. For example, libc.so.1 has provided a number of platform specific filtees, libc_psr.so.1, that provide optimized versions of the memmove() family. ldd(1) reveals these filtees.
  % ldd /bin/date
        libc.so.1 =>     /lib/libc.so.1
        libm.so.2 =>     /lib/libm.so.2
        /platform/SUNW,Sun-Fire-880/lib/libc_psr.so.1

In the same posting, I also touched on the use of Hardware Capabilities as a means of identifying the requirements of a filtee, and how these can be employed to provide a filtering mechanism.

Object filters do have some downsides. There's overhead involved in locating the filtees, and often a maintenance burden of providing a complex symlink hierarchy to provide the specific filtee instances.

The Solaris link-editors have now been updated to provide for multiple instances of a function to exist within the same dynamic object. Each instance of the function is associated with a group of capabilities. During process execution, the runtime linker can select from a family of symbol instances, the one whose capabilities are best represented by the present system.

This new mechanism, termed Symbol Capabilities, provides filtering within the same object, in contrast to the traditional mechanism that provides filtering by selecting from a collection of external objects.

Symbol capabilities have the advantage of providing a more lightweight infrastructure than the existing filtee objects. The runtime cost of searching for, and selecting, the best function is less than that of searching for external objects. In addition, there's no need to maintain any symlink hierarchy to provide the specific filtee instances. Symbol capabilities can also replace the ad-hoc techniques users have employed to vector to their own family of function interfaces.

Symbol capabilities have been used to re-architect the filtering mechanism of many Solaris shared objects, including libc.so.1 on SPARC. For example, libc.so.1 now contains the following function instances.

  % elfdump -H /lib/libc.so.1

  Capabilities Section:  .SUNW_cap

   Symbol Capabilities:
       index  tag               value
         [1]  CA_SUNW_ID       sun4u
         [2]  CA_SUNW_MACH     sun4u

    Symbols:
       index    value      size      type bind oth ver shndx   name
         [1]  0x000f0940 0x000000bc  FUNC LOCL  D    0 .text   memmove%sun4u
         [7]  0x000f1e0c 0x00001b28  FUNC LOCL  D    0 .text   memcmp%sun4u
        [11]  0x000f09fc 0x00001280  FUNC LOCL  D    0 .text   memcpy%sun4u
        [17]  0x000f1c80 0x0000018c  FUNC LOCL  D    0 .text   memset%sun4u

   Symbol Capabilities:
       index  tag               value
         [4]  CA_SUNW_ID       sun4u-opl
         [5]  CA_SUNW_PLAT     SUNW,SPARC-Enterprise

    Symbols:
       index    value      size      type bind oth ver shndx   name
         [2]  0x000f3940 0x00000310  FUNC LOCL  D    0 .text   memmove%sun4u-opl
         [8]  0x000f458c 0x00000120  FUNC LOCL  D    0 .text   memcmp%sun4u-opl
        [12]  0x000f3c80 0x0000076c  FUNC LOCL  D    0 .text   memcpy%sun4u-opl
        [18]  0x000f4400 0x0000018c  FUNC LOCL  D    0 .text   memset%sun4u-opl
  ....

Each of these functions provides an optimized instance that is associated with a specific system. Note that the capability identifiers have been expanded from the original hardware capabilities (CA_SUNW_HW_1) and software capabilities (CA_SUNW_SF_1) definitions, and now provide for platform identifiers (CA_SUNW_PLAT) and machine identifiers (CA_SUNW_MACH). In addition, the capabilities group can be labeled with its own identifier (CA_SUNW_ID), which in turn is used to name the function instances.

There still must exist generic interfaces for each symbol family, so libc.so.1 contains a basic memmove(), memcpy(), etc.

      [1569]  0x000433e4 0x000001f4  FUNC GLOB  D   41 .text   memmove
      [1860]  0x000431cc 0x000001b0  FUNC GLOB  D   41 .text   memcpy

However, these generic symbols lead the family of instances, which are maintained as a chain within the capabilities data structures. During process execution, the runtime linker traverses the chain of family instances and selects the best instance for the present system.

  % elfdump -H /lib/libc.so.1
  ....
  Capabilities Chain Section:  .SUNW_capchain

   Capabilities family: memmove
     chainndx  symndx      name
            1  [1569]      memmove
            2  [1]         memmove%sun4u
            3  [2]         memmove%sun4u-opl
            4  [3]         memmove%sun4u-us3-hwcap1
            5  [4]         memmove%sun4u-us3-hwcap2
            6  [5]         memmove%sun4v-hwcap1
            7  [6]         memmove%sun4v-hwcap2
  ....

At runtime, you can observe which family instance is used.

  % LD_DEBUG=cap foo
  14507:   symbol=memmove[1569]:  capability family default
  14507:   symbol=memmove%sun4u[1]:  capability specific (CA_SUNW_MACH):  [ sun4u ]
  14507:   symbol=memmove%sun4u[1]:  capability candidate
  14507:   symbol=memmove%sun4u-opl[2]:  capability specific (CA_SUNW_PLAT):  [ SUNW,SPARC-Enterprise ]
  14507:   symbol=memmove%sun4u-opl[2]:  capability rejected
  ....
  14507:   symbol=memmove%sun4u-us3-hwcap1[3]:  capability specific (CA_SUNW_PLAT):  [ SUNW,Sun-Blade-1000 ]
  14507:   symbol=memmove%sun4u-us3-hwcap1[3]:  capability candidate
  ....
  14507:   symbol=memmove%sun4u-us3-hwcap1[3]:  used
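The chain traversal traced above can be sketched in C. This is only an illustrative model: the struct cap_instance type, the cap_score() ranking (platform match beats machine match, which beats the generic lead symbol) and the cap_select() function are assumptions made for the example, not the runtime linker's actual data structures or selection rules.

```c
#include <stddef.h>
#include <string.h>

/* One member of a capabilities family (illustrative model, not the ELF layout). */
struct cap_instance {
	const char *name;	/* e.g. "memmove%sun4u-opl" */
	const char *mach;	/* required machine name, or NULL if none */
	const char *plat;	/* required platform name, or NULL if none */
};

/*
 * Hypothetical ranking: platform match (2) > machine match (1) >
 * generic lead symbol (0). A value of -1 means the instance is rejected.
 */
static int
cap_score(const struct cap_instance *ci, const char *sys_mach,
    const char *sys_plat)
{
	if (ci->plat != NULL)
		return (strcmp(ci->plat, sys_plat) == 0 ? 2 : -1);
	if (ci->mach != NULL)
		return (strcmp(ci->mach, sys_mach) == 0 ? 1 : -1);
	return (0);
}

/* Walk the chain; chain[0] is the generic lead symbol and is the fallback. */
static const struct cap_instance *
cap_select(const struct cap_instance *chain, size_t n,
    const char *sys_mach, const char *sys_plat)
{
	const struct cap_instance *best = &chain[0];
	int best_score = 0;
	size_t i;

	for (i = 1; i < n; i++) {
		int score = cap_score(&chain[i], sys_mach, sys_plat);

		if (score > best_score) {
			best = &chain[i];
			best_score = score;
		}
	}
	return (best);
}
```

Given a chain holding memmove, memmove%sun4u and the two platform specific instances shown in the trace, calling cap_select() with machine "sun4u" and platform "SUNW,Sun-Blade-1000" returns the memmove%sun4u-us3-hwcap1 entry in this model, while an unrecognized machine and platform fall back to the generic memmove.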

In the future, it should be possible for a compiler to create various instances of a function and pass these to the link-editor. By embedding the necessary capabilities information in the object file created, the link-editor can create the necessary symbol families, and the appropriate information to provide to the runtime linker.

For now, it is also possible to create various instances using the link-editor directly. First, you need to identify a relocatable object with the capabilities it requires. This can be achieved by linking a relocatable object with a mapfile. In the following example, foo.c is built to use, and identify its requirement on, a sun4v system.

  % cc <options to trigger sun4v specific optimizations> -c -o foo.o foo.c
  % cat mapfile-cap
  $mapfile_version 2

  CAPABILITY sun4v {
          MACHINE = sun4v;
  };

  SYMBOL_SCOPE {
          global:
                foo;
                bar;
          local:
                *;
  };
  % ld -r -o objcap.o -Mmapfile-cap -Breduce foo.o
  % elfdump -H objcap.o

  Capabilities Section:  .SUNW_cap

   Object Capabilities:
       index  tag               value
         [0]  CA_SUNW_ID       sun4v
         [1]  CA_SUNW_MACH     sun4v

  % elfdump -s objcap.o | fgrep foo
        [87]  0x00000000 0x000000bc  FUNC GLOB  D    1 .text   foo
        [93]  0x00000120 0x000000a0  FUNC GLOB  D    1 .text   bar

This capabilities object can then be translated so that each global symbol is associated with the defined capabilities.

  % ld -r -o symcap.o -z symbolcap objcap.o
  % elfdump -H symcap.o

  Capabilities Section:  .SUNW_cap

   Symbol Capabilities:
       index  tag               value
         [1]  CA_SUNW_ID       sun4v
         [2]  CA_SUNW_MACH     sun4v

    Symbols:
       index    value      size      type bind oth ver shndx   name
        [87]  0x00000000 0x000000bc  FUNC LOCL  D    0 .text   foo%sun4v
        [93]  0x00000120 0x000000a0  FUNC LOCL  D    0 .text   bar%sun4v

Note that this translation converts each global symbol to a local symbol, renames the symbol using the capabilities identifier, and leaves a symbol reference to the original symbol.

        [87]  0x00000000 0x000000bc  FUNC LOCL  D    0 .text   foo%sun4v
        [93]  0x00000120 0x000000a0  FUNC LOCL  D    0 .text   bar%sun4v
       [101]  0x00000000 0x00000000  FUNC GLOB  D    0 UNDEF   foo
       [102]  0x00000000 0x00000000  FUNC GLOB  D    0 UNDEF   bar

This object can now be combined into a final dynamic object together with a generic instance of the symbol family to lead the capabilities family.

Debugging Capabilities

Along with these capabilities updates, the runtime linker has also been enhanced to provide an environment for capabilities experimentation. If the previous examples were combined into a shared object, libfoo.so.1, and this object is executed on a sun4v system, then the foo%sun4v() instance is bound to and used at runtime. However, you can establish an alternative capabilities environment, by removing or setting capabilities, along with identifying the object to which the alternative capabilities should be applied.

To exercise the generic version of foo() from libfoo.so.1, while executing on a sun4v platform, you can set the following environment variables.

  %  LD_CAP_MACH= LD_CAP_FILES=libfoo.so.1 <app>

The use of LD_CAP_FILES isolates the alternative capabilities to the object identified, rather than to every object within the process. With this mechanism, you can exercise a family of capability instances on a machine that provides all required capabilities.



Technorati Tag: OpenSolaris
Technorati Tag: Solaris

Monday Jul 21, 2008

Direct Binding - the -zdirect/-Bdirect options, and probing

In a previous posting I introduced the use of direct bindings within the OSNet consolidation. A comment to this posting questioned the difference between the two options -z direct and -B direct, and pointed out that runtime errors can occur during process execution if a lazy dependency (typically enabled with -B direct) can not be found. In this entry, I'll discuss the difference between the -z direct and -B direct options, and offer a useful technique for handling the case where the lazy dependency is not present at runtime.

First, the difference between -z direct and -B direct. A full discussion of these options can be found in the Direct Binding Appendix of the Linker and Libraries Guide. Aside from lazy loading being enabled by -B direct, the essential difference between these options is a trade-off between ease of use and degree of control. -B direct can be specified anywhere on the command line, and results in any external and internal symbol bindings being established as direct. This means that if libX.so defines xy() and references xy(), then a direct binding will be established within the same object.

    % cc -G -o libxy.so xy.c -Bdirect -Kpic
    % elfdump -y libxy.so | fgrep xy
          [7]  DB          <self>             xy

Hence, -B direct is a blunt club that hits everything. In contrast, -z direct is sensitive to its position in the command line, and can therefore be used in a more precise manner. Only external references that are resolved to dependencies that follow -z direct are established as direct. In the following example, only the references to libX.so and libY.so will have direct bindings established.

    % cc -o libxy.so xy.c -lA -lB -z direct -lX -lY

But the real question is why would you use one option over the other? -B direct is recommended where possible, due to its simplicity and ease of use. However, there are cases where finer grained control is needed, and -z direct is more appropriate. One example is libproc. This library contains many routines that users (typically debugging tools) wish to interpose upon. We want libproc to have direct bindings to any of the dependencies it requires (libc, libelf, etc.), but we do not wish libproc to directly bind to itself. Therefore, by using -z direct we can build libproc to bind directly to its own dependencies while freely binding to any interposers, for any of the interfaces libproc defines. This interposition is provided regardless of the interposers being explicitly defined (a requirement as we do not have control over all the consumers of libproc). Note, we even went a little bit further, and defined all the libproc interfaces as NODIRECT, which prevents any direct binding to libproc. This was to prevent any dependencies binding to libproc instead of to an interposer.

The comment to my previous blog entry also raised the issue of how lazy loading can be compromised if a lazy dependency can not be found. Typically, lazy loading is used to locate dependencies that are expected to exist. Historically, interfaces like dlopen(3c) have been used to test for the occurrence of dependencies that might not exist. However, a useful technique is to use lazy loading and test for the existence of a dependency with dlsym(3c). By testing for the existence of a known interface with a lazy dependency you can verify the dependency exists and then feel free to call any other interface within that dependency.

When a dependency is bound to, the SONAME of that dependency is recorded in the caller.

    % cc -G -o libxy.so -hlibxy.so xy.c -Kpic
    % elfdump -d libxy.so | fgrep SONAME
	 [2]  SONAME		0x1		    libxy.so

    % cc -o main main.c -z lazyload -L. -lxy
    % elfdump -d main | egrep "NEEDED|POSFLAG"
	 [0]  POSFLAG_1		0x1		    [ LAZY ]
	 [1]  NEEDED		0x163		    libxy.so

With this dependency established, you can protect yourself from calling the interfaces within the dependency unless the interface family you are interested in is known to exist.

    if (dlsym(RTLD_PROBE, "symbol_in_libxy_1") != NULL) {
	/*
	 * feel free to call any-and-all interfaces in libxy
	 */
	symbol_in_libxy_1();
	symbol_in_libxy_2();
	....
    }

With this model you don't need to know the name of the object that provides the interfaces, as the name was recorded at link-time. And, the dlsym() will trigger an attempt to load the dependency associated with the symbol. All other references can be made directly through function calls rather than through dlsym(). This allows the compiler, or verification tools like lint, to ensure that you are calling the function with the proper argument and return types, and will therefore lead to safer and more robust code.
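The probing test can be wrapped in a small helper. The following is a minimal sketch, with assumptions: the names probe_handle() and interface_present() are hypothetical, and because RTLD_PROBE is specific to Solaris, the sketch falls back to a handle obtained from dlopen(NULL, ...) elsewhere so that it still compiles. That fallback searches the objects already loaded in the process and does not reproduce the lazy loading behavior described above.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Return the handle used for probing. RTLD_PROBE is Solaris-specific;
 * the dlopen(NULL, ...) fallback is an assumption made for portability.
 */
static void *
probe_handle(void)
{
#ifdef RTLD_PROBE
	return (RTLD_PROBE);
#else
	return (dlopen(NULL, RTLD_LAZY));
#endif
}

/* Return 1 if the named interface can be located by the process, else 0. */
static int
interface_present(const char *name)
{
	return (dlsym(probe_handle(), name) != NULL);
}
```

A caller would guard a group of direct function calls with a single test such as if (interface_present("symbol_in_libxy_1")), leaving the calls themselves as ordinary, type-checked function calls.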

The use of dlopen() is still appropriate for selecting between differing objects, or when the caller is not knowledgeable of the dependency, such as the case with plugins. In other cases, the use of lazy loading together with dlsym(), as outlined above, is recommended, as the implementation is usually easier to write, debug and deploy.




Wednesday May 14, 2008

Direct Binding - now the default for OSNet components

Direct Binding refers to a symbol search and binding model that has been available in Solaris for quite some time. See Library Bindings.

At runtime, a symbol reference from an object must be located by the runtime linker (ld.so.1(1)). Under direct bindings, symbol definitions are searched for directly in the dependency that provides the symbol definition. The provider of the symbol definition was determined by the link-editor (ld(1)) when the object was originally built.

This direct binding model differs from the traditional symbol search and binding model. In the traditional model, the symbol search starts with the application and advances through each object that is loaded within the process until a symbol definition is found.

Given that direct binding capabilities have been available for some time, and a number of other consolidations have been happily using them, why did it take so long to get this model employed to build the OSNet consolidation? (that's the Solaris core OS and networking).

Basically, there were a number of corner cases to solve. One advantage of direct bindings is that this model can protect against unintentional interposition. One disadvantage of direct bindings is that this model can circumvent intentional interposition. Determining whether interposition exists, and whether it is intentional or unintentional is the fun part. The core Solaris libraries seem to be a frequent target of interposition.

So first, what is interposition? Suppose a process is made up of several shared objects, and two shared objects, libX.so and libY.so, export the same symbol xy(). Under the traditional symbol search model, any references to the symbol xy() will be bound to the first instance of xy() that is found. So, if libX.so is loaded before libY.so, then the instance of xy() within libX.so is used to satisfy all references. The instance of xy() within libX.so is said to interpose on the instance in libY.so.

Now, suppose that two other shared objects within the process, libA.so and libB.so, reference xy(). Under the traditional symbol search model, both of these objects will bind to libX.so. But, if libA.so was built to depend on libX.so, and libB.so was built to depend on libY.so, and both employed direct bindings, then libA.so would bind to xy() in libX.so, and libB.so would bind to xy() in libY.so.
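The difference between the two search models can be sketched in C. This is a toy model only: struct object, traditional_lookup() and direct_lookup() are hypothetical names introduced for the illustration, and the real runtime linker works from ELF symbol tables rather than arrays of strings.

```c
#include <stddef.h>
#include <string.h>

#define	MAX_SYMS	4

/* A loaded object and the symbols it exports (illustrative model only). */
struct object {
	const char *name;
	const char *exports[MAX_SYMS];	/* NULL-terminated unless full */
};

/* Return 1 if the object exports the named symbol, else 0. */
static int
object_defines(const struct object *obj, const char *sym)
{
	size_t i;

	for (i = 0; i < MAX_SYMS && obj->exports[i] != NULL; i++) {
		if (strcmp(obj->exports[i], sym) == 0)
			return (1);
	}
	return (0);
}

/* Traditional model: search every object in load order, first definition wins. */
static const struct object *
traditional_lookup(const struct object *loaded, size_t n, const char *sym)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (object_defines(&loaded[i], sym))
			return (&loaded[i]);
	}
	return (NULL);
}

/* Direct binding: search only the provider recorded at link-edit time. */
static const struct object *
direct_lookup(const struct object *provider, const char *sym)
{
	return (object_defines(provider, sym) ? provider : NULL);
}
```

With a load order of main, libX.so, libY.so, traditional_lookup() resolves xy() to libX.so for every caller, whereas direct_lookup() with libY.so as the recorded provider resolves the reference from libB.so to libY.so.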

One avenue to observe this difference in binding is to employ lari(1), a utility that looks for interesting binding events. Not surprisingly, most interesting events revolve around multiple instances of a symbol. From our example, the traditional symbol search model will reveal:

    % lari main
    [2:2E]: xy(): ./libX.so
    [2:0]: xy(): ./libY.so

Here, we see the two instances of xy(), with libX.so being the recipient of the two external bindings (2E).

However, if libA.so and libB.so employ direct bindings then the symbol search model will reveal:

    % lari main
    [2:1ED]: xy(): ./libX.so
    [2:1ED]: xy(): ./libY.so

Here, both libX.so and libY.so are the recipient of one external, direct binding (1ED).

The question now is what did the developer of libX.so intend? Did they want to capture all bindings to xy()?, or was their choice of the name xy() an unintended name-clash with the existing symbol in libY.so?

It is this latter name-clash issue that was one of the main motivators in having the OSNet consolidation use direct bindings for all system libraries. There have been numerous instances of user applications breaking system functionality by unintentionally interposing on a symbol that exists within a system library. However, although we wished to protect our libraries from unintentional interposition, we still wished to provide for interposition where it was intended.

Although the direct bindings implementation prevents unintentional interposition, the implementation does allow for interposition. However, if you want interposition then you now need to be explicit. Explicit interposition can be achieved with LD_PRELOAD (an old favorite), or by tagging the associated object with -z interpose, or by identifying symbols within an executable with INTERPOSE mapfile directives.

Alternatively, if you design a library with the intent that users be allowed to interpose on symbols within the library, you can disable direct binding to the library. Disabling can be achieved for the whole library using the link-editor's -B nodirect option, or by identifying individual symbols with NODIRECT mapfile directives or as singletons.

If you suspect an issue with direct bindings in effect, you can return to the traditional symbol search model by setting the environment variable LD_NODIRECT=yes. A suggestion for investigating the issue further would be:

    % lari main > direct
    % LD_NODIRECT=yes lari main > no-direct
    % diff direct no-direct

Standard interposition dates from an era where applications had very few dependencies. Times have changed, and the number of dependencies has dramatically increased. Although interposition can be powerful, it can also be fragile and scale badly. Diagnosing the occurrence of interposition can be a challenge.

Given the ability to time travel, direct binding would probably have been the only model for symbol binding, and explicit interposition the only means of defining an interposer. Having to support direct bindings and the traditional model with the various flags and options is the cost of backward compatibility. However, the ability of ELF to stretch this far speaks to the overall quality of its initial design, warts and all.

The OSNet consolidation uses the various binding-control flags to both identify interposers, and prevent direct bindings to commonly interposed upon symbols. All the gory details of direct binding, the various flags that can be used, and examples of their use, can be found in the Direct Binding Appendix of the Linker and Libraries Guide.




Monday Jun 25, 2007

We've moved - /usr/ccs/bin commands, that is

A recent update to Solaris Nevada (build 68 to be precise) has moved the /usr/ccs/bin utilities to /usr/bin. This move includes the link-editor, and associated utilities.




Tuesday Dec 19, 2006

'_init'/'_fini' not found - use the compiler drivers

A recently added error check within ld(1) has uncovered the following condition.

    ld: warning: symbol `_init' not found, but .init section exists - \
        possible link-edit without using the compiler driver
    ld: warning: symbol `_fini' not found, but .fini section exists - \
        possible link-edit without using the compiler driver

The encapsulation, and execution of .init and .fini sections is a combination of user definitions and compiler driver files.

Users typically create these sections using a pragma. For example, the following code produces a .init and .fini section.

    % cat foobar.c
    static int foobar = 0;

    #pragma init (foo)

    void foo()
    {
            foobar = 1;
    }

    #pragma fini (bar)

    void bar()
    {
            foobar = 0;
    }

The functions themselves are placed in a .text section, and a call to foo() is placed in a .init section, and a call to bar() is placed in a .fini section. So, how do these functions get called in the runtime environment?

This is where the compiler drivers come in. As part of creating a dynamic object, the compiler drivers provide input files that encapsulate the .init and .fini calls. This encapsulation effectively creates two functions that are labeled _init and _fini respectively.

    _init {      # provided by .init in crti.o
    call foo()   # provided by .init in foobar.c
    }            # provided by .init in crtn.o

    _fini {      # provided by .fini in crti.o
    call bar()   # provided by .fini in foobar.c
    }            # provided by .fini in crtn.o

It is the symbols _init and _fini that are recognized by ld(1) and are registered within the object so that the two addresses are called at runtime.
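For readers without the Sun compilers, the observable effect can be reproduced with a different technique: the constructor and destructor function attributes of GCC and Clang. This is a swapped-in analogue rather than the pragma mechanism described here, and the way such toolchains register the functions may differ from the .init and .fini sections discussed in this entry. The initialized() helper is a hypothetical addition for the illustration.

```c
static int foobar = 0;

/* Runs before main(); GCC/Clang analogue of "#pragma init (foo)". */
__attribute__((constructor))
static void
foo(void)
{
	foobar = 1;
}

/* Runs during process exit; analogue of "#pragma fini (bar)". */
__attribute__((destructor))
static void
bar(void)
{
	foobar = 0;
}

/* Report whether the initialization code has already run. */
static int
initialized(void)
{
	return (foobar);
}
```

When this code is built through the compiler driver, initialized() returns 1 by the time main() starts, which is the behavior developers are missing when their initialization code is never registered.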

Some folks are using ld(1) directly to build their shared objects, and thus the encapsulating crt files aren't being included within the link-edit. The result is that even though .init and .fini sections may exist, no _init and _fini encapsulation occurs, and no symbols are registered with the object for runtime execution.

This leaves the developer wondering why their .init and .fini code is never executed, and is the rationale behind us adding the warning message.

It's best not to use ld(1) directly to build any executables or shared objects; let the compiler drivers do it for you.




Friday Oct 13, 2006

Displacement Relocation Warnings - what do they mean?

There have been a couple of postings recently regarding relocation warnings that have been observed when using the link-editor's -z verbose option. The first warning originates from building a shared object:

    ld: warning: relocation warning: R_SPARC_DISP32: file shared.o: \
        symbol <unknown>: displacement relocation applied to the \
        symbol __RTTI__1nEBase_: at 0x8: displacement relocation will \
        not be visible in output image

Then, if this shared object is referenced as a dependency when building a dynamic executable, another warning can be generated:

    ld: warning: relocation warning: R_SPARC_COPY: file shared.so: \
        symbol xxxx: may contain displacement relocation

These warnings stem from an old request from the compiler folks to help prevent problems with displacement relocations and copy relocations.

You have to be a little relocation savvy to understand these scenarios - they make my head hurt. Investigations are underway to determine why these warnings are starting to surface.




Dynamic Object Versioning - specifying a version binding

After reading a previous posting on versioning, a developer asked how they could specify what version to bind to when they built their application. For example, from the version definitions within libelf.so.1:

    % pvs -d /lib/libelf.so.1
        libelf.so.1;
        SUNW_1.5;
        SUNW_1.4;
        ....
        SUNWprivate_1.1;

how could you restrict an application to only use the interfaces defined by SUNW_1.4? Note that version SUNW_1.4 inherits previous versions.

The Linker and Libraries Guide covers this topic in the section Specifying a Version Binding. In a nutshell, you can specify a version control mapfile directive:

    % cat mapfile
    libelf.so - SUNW_1.4;

Notice that the shared object name is the compilation environment name. This is the name that gets resolved when you specify -lelf. By adding this mapfile to your link-edit, the link-editor will restrict the interfaces you are able to bind to from libelf to those provided by version SUNW_1.4.

Note, if you build against libelf.so.1 and discover a dependency on SUNW_1.5, then you are referencing interfaces from the SUNW_1.5 version. These will show up as undefined errors if you build an application using the version control mapfile directive. You'll have to recode your application to not use the SUNW_1.5 interfaces.

For example, this application is referencing gelf_getcap(3ELF).

    % pvs -r prog
        libelf.so.1 (SUNW_1.5);
        ....
    % pvs -dos /lib/libelf.so.1 | fgrep SUNW_1.5
        /lib/libelf.so.1 -      SUNW_1.5: gelf_getcap;
        /lib/libelf.so.1 -      SUNW_1.5: gelf_update_cap;

Note also, binding to specific versions is not a panacea for building software on release N and running on N-1. Other factors can affect your build environment, such as headers. It is always safest to build your software on the oldest platform on which it is intended to run.

Of course, building on the latest release can provide a richer debugging environment in which to develop your software. I often try building things on the latest environment, and then fall back to the oldest environment for final testing and for creating final deliveries.




Wednesday Oct 04, 2006

Changing Search Paths with crle(1) - they are a replacement

A developer who wished to add /usr/sfw/lib to their default runtime search path, managed to turn their system into a brick by using crle(1):

    # crle -l /usr/sfw/lib
    # ls
    ld.so.1: ls: fatal: libsec.so.1: open failed: No such file or directory
    Killed

The problem was that crle(1), in this basic form, created a system wide configuration file. This configuration file defined that the default runtime search path for shared object dependencies is /usr/sfw/lib. This search path definition had replaced the standard defaults.

You can determine the standard search path defaults using crle(1). For example, without any system wide configuration file, the following defaults might exist:

   $ crle

   Default configuration file (/var/ld/ld.config) not found
     Platform:     32-bit LSB 80386
     Default Library Path (ELF):   /lib:/usr/lib  (system default)
     Trusted Directories (ELF):    /lib/secure:/usr/lib/secure  (system default)

This user had effectively removed the system default search paths, and hence the runtime linker, ld.so.1, had been unable to find the basic dependencies required by all applications. The new configuration file revealed:

   $ crle

   Configuration file [version 4]: /var/ld/ld.config
     Platform:     32-bit LSB 80386
     Default Library Path (ELF):   /usr/sfw/lib
     Trusted Directories (ELF):    /lib/secure:/usr/lib/secure  (system default)

   Command line:
     crle -c /var/ld/ld.config -l /usr/sfw/lib

The -l option allows you to define new search paths. However, rather than dictate that a new search path definition be prepended, inserted, or appended to any existing search paths, crle(1) simply replaces any existing search paths. The man page spells this out in some detail:

    -l dir
      ....
      Use of this option replaces  the  default  search  path.
      Therefore,  a  -l option is normally required to specify
      the original system default in relation to any new paths
      that are being applied. ....

Therefore, to prepend the new search path to the existing defaults you should specify each search path:

    # crle -l /usr/sfw/lib -l /lib -l /usr/lib
    # ls
    devices/        lib/            proc/
    ....

An alternative is to use the -u (update) option. Any new search paths supplied with crle(1) are appended to any existing search paths. Even if a configuration file does not already exist, the -u option causes any new search paths to be appended to the system defaults:

   # crle -u -l /usr/sfw/lib
   # crle

   Configuration file [version 4]: /var/ld/ld.config
     Platform:     32-bit LSB 80386
     Default Library Path (ELF):   /lib:/usr/lib:/usr/sfw/lib
     Trusted Directories (ELF):    /lib/secure:/usr/lib/secure  (system default)

   Command line:
     crle -c /var/ld/ld.config -l /lib:/usr/lib:/usr/sfw/lib
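The contrast between replacing and updating the search path can be modeled in a few lines of C. This is a sketch of the semantics only, not of crle(1) itself: struct config, set_path() and update_path() are hypothetical names, and the system default is hard-coded to the /lib:/usr/lib value shown earlier.

```c
#include <stdio.h>
#include <string.h>

#define	SYSTEM_DEFAULT	"/lib:/usr/lib"
#define	PATH_MAX_LEN	256

/* Minimal model of a configuration: an empty path means "no config file". */
struct config {
	char libpath[PATH_MAX_LEN];
};

/* Model of "crle -l dir": the new path replaces whatever was in effect. */
static void
set_path(struct config *cfg, const char *dir)
{
	snprintf(cfg->libpath, sizeof (cfg->libpath), "%s", dir);
}

/* Model of "crle -u -l dir": append to the existing path or the defaults. */
static void
update_path(struct config *cfg, const char *dir)
{
	char base[PATH_MAX_LEN];

	snprintf(base, sizeof (base), "%s",
	    cfg->libpath[0] != '\0' ? cfg->libpath : SYSTEM_DEFAULT);
	snprintf(cfg->libpath, sizeof (cfg->libpath), "%s:%s", base, dir);
}
```

Starting from an empty configuration, set_path() with /usr/sfw/lib leaves only /usr/sfw/lib, which is the state that broke ls above, while update_path() yields /lib:/usr/lib:/usr/sfw/lib.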

Note that the usage message from crle(1) is a little misleading, as it implies that the new search path is an addition:

   # crle -X
   crle: illegal option -- X
       ....
       [-l dir]        add default search directory
       ....

We'll get the usage message updated to be more precise.

Remember, should you ever get in trouble with crle(1) configuration files, you can always instruct the runtime linker to ignore processing the configuration file by setting the environment variable LD_NOCONFIG=yes:

   # crle -l /does/not/exist
   # ls
   ld.so.1: ls: fatal: libsec.so.1: open failed: No such file or directory
   Killed
   # LD_NOCONFIG=yes ls
   devices/        lib/            proc/
   ....
   # LD_NOCONFIG=yes rm /var/ld/ld.config
   # ls
   devices/        lib/            proc/
   ....

It is recommended that when creating a new configuration file, you first create the file in a temporary location. The environment variable LD_CONFIG can then be set to this new configuration file. Refer to the crle(1) man page for an example.

Note that crle(1) itself should not be crippled by blowing away the system default search paths:

   # crle -l /does/not/exist
   # crle

   Configuration file [version 4]: /var/ld/ld.config
     Platform:     32-bit MSB SPARC
     Default Library Path (ELF):   /does/not/exist
     Trusted Directories (ELF):    /lib/secure:/usr/lib/secure  (system default)

   Command line:
     crle -c /var/ld/ld.config -l /does/not/exist

   # elfdump -d /usr/bin/crle | fgrep RPATH
   ld.so.1: fgrep: fatal: libc.so.1: open failed: No such file or directory
   ksh: 18184 Killed
   # LD_NOCONFIG=yes; export LD_NOCONFIG
   # elfdump -d /usr/bin/crle | fgrep RPATH
      [6]  RPATH             0x61b               $ORIGIN/../lib

Using $ORIGIN within a runpath provides crle(1) with a level of protection against insufficient configuration file information.




Wednesday Apr 26, 2006

Wrong ELF Class - requires consistent compiler flags

Every now and then, someone encounters the following error.

  % cc -G -o foo.so foo.o -lbar
  ld: fatal: file foo.o: wrong ELF class: ELFCLASS64
  ld: fatal: File processing errors. No output written to foo

Or perhaps the similar error.

  % cc -G -xarch=amd64 -o foo.so foo.o -lbar
  ld: fatal: file foo.o: wrong ELF class: ELFCLASS32
  ld: fatal: File processing errors. No output written to foo

This issue stems from the compiler flags that have been used to compile the relocatable object foo.o, and the compiler flags that are finally used to invoke the link-edit of this object.

The man page for ld(1) hints at the issue.

  No command-line option is required to distinguish 32-bit
  objects or 64-bit objects. The link-editor uses the ELF
  class of the first relocatable object file that is found on
  the command line, to govern the mode in which to operate.

When the compiler drivers are used to generate an executable or shared object, the driver typically supplies a couple of their own files to the link-edit. One or more of these additional files will be read by the link-editor before the file foo.o. Expanding the compiler processing might reveal:

  % cc -# -G -o foo.so foo.o
  ...
  ld crti.o values-xa.o -o foo.so -G foo.o ... crtn.o

Here, the first input file read by the link-editor is crti.o (this is typically a full path to a compiler specific subdirectory). Expanding a 64-bit link-edit request might reveal:

  % cc -# -xarch=amd64 -G -o foo.so foo.o
  ...
  ld amd64/crti.o amd64/values-xa.o -o foo.so -G foo.o ... amd64/crtn.o

Armed with this information it should be easy to see how the ELFCLASS error messages can be produced. If for example, you wish to create a 64-bit shared object from one or more relocatable objects, you might first create the 64-bit relocatable object like:

  % cc -c -xarch=amd64 foo.c
  % file foo.o
  foo.o:       ELF 64-bit LSB relocatable AMD64 Version 1

But, if you fail to inform the compiler driver that this object should be linked into a 64-bit object, you'll produce the ELFCLASS64 error message. The first file read by the link-editor will be the 32-bit version of crti.o. This puts ld(1) into 32-bit mode, and hence when foo.o is read it is rejected as incompatible with the mode of the link-edit.

Similarly, a 32-bit relocatable object:

  % cc -c foo.c

that is handed to a 64-bit link-edit will produce the ELFCLASS32 error message.

Make sure that the architecture flag used to build a relocatable object is also passed to the compiler driver phase of linking the relocatable object into a final executable or shared object.
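
The class check itself is simple enough to sketch. The following is only an illustration of the rule quoted from ld(1), not the link-editor's code. The function names are hypothetical, and the byte offsets and values are the ELF identification constants (class byte at index 4, 1 for 32-bit, 2 for 64-bit).

```c
#include <stddef.h>
#include <stdio.h>

/* Positions and values within the ELF identification bytes (e_ident). */
enum { EI_CLASS_IDX = 4, CLASS_32 = 1, CLASS_64 = 2 };

/* Return 32 or 64 for a valid ELF identification, or -1 otherwise. */
static int elf_class_bits(const unsigned char *ident, size_t len)
{
    if (len < 5 || ident[0] != 0x7f || ident[1] != 'E' ||
        ident[2] != 'L' || ident[3] != 'F')
        return -1;
    if (ident[EI_CLASS_IDX] == CLASS_32)
        return 32;
    if (ident[EI_CLASS_IDX] == CLASS_64)
        return 64;
    return -1;
}

/* Read the first bytes of a file and report its ELF class. */
static int elf_class_of_file(const char *path)
{
    unsigned char ident[16];
    FILE *fp = fopen(path, "rb");
    size_t n;

    if (fp == NULL)
        return -1;
    n = fread(ident, 1, sizeof (ident), fp);
    fclose(fp);
    return elf_class_bits(ident, n);
}

/*
 * Model of the ld(1) rule: the first input file sets the mode.
 * Returns the index of the first input whose class differs from
 * the first file, or -1 if all inputs agree.
 */
static int first_class_mismatch(const int *bits, size_t count)
{
    size_t i;

    for (i = 1; i < count; i++)
        if (bits[i] != bits[0])
            return (int)i;
    return -1;
}
```

Feeding first_class_mismatch() the classes of a 32-bit crti.o, a 32-bit values-xa.o and a 64-bit foo.o identifies foo.o (index 2) as the offender, which is the ELFCLASS64 case above.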




Thursday Mar 16, 2006

C++ Dynamic Linking - symbol visibility issues

Recently, a customer's use of C++ objects within a dlopen(3c) environment revealed a problem that took some time to evaluate and understand. Sadly, this seems to be a recurring issue where the expectations of the C++ implementation are compromised by dynamic linking capabilities. Of course, dynamic linking is the norm for Solaris, and C++ is commonly employed in dynamic linking environments. But there are subtleties regarding symbol visibility that can cause problems.

This customer was using a Java application to System.loadLibrary a C++ shared object, built to use standard iostreams. The underlying dlopen() failed as part of calling _init, and the result was a core dump. By preloading libumem(3lib), the customer discovered the problem was a bad free().

   >::umem_status
   Status:         ready and active
   Concurrency:    4
   Logs:           (inactive)
   Message buffer:
   free(d352a040): invalid or corrupted buffer

There seemed to be an inconsistency in memory allocation underlying this failure. And, I felt I'd been here before. A similar (but slightly different as it turns out) problem had been uncovered a few months ago. So, I started poking through the symbol bindings for this process. I do this for a living, but even I find analyzing the symbol bindings of a process to be a little daunting. There are just so many bindings to wade through. In Solaris 10 we invented lari(1) to help uncover interesting symbol bindings. I gave a quick introduction to this tool in a previous posting.

First it was necessary to obtain a trace of all process bindings, including those produced by the dlopen(). The following environment variables result in this trace being saved in the file dbg.pid.

    % LD_DEBUG=files,detail LD_DEBUG_OUTPUT=dbg java-app

The interesting information that lari(1) unravels focuses on the existence of multiple instances of the same symbol name. But even this can be a lot of information to digest (although I still don't understand why so many objects export the same interfaces). For this application, I wanted to narrow things down to just those symbols that were involved in a runtime binding. And, as we're dealing with C++, a little bit of demangling might be useful too.

    % lari -bC -D dbg.pid
    [3:1EP]: __1cDstdMbasic_string4Ccn0ALchar_traits4Cc__n0AJallocator4Cc___J__nullref_[0x30] \
        [std::basic_string <char,std::char_traits <char>,std::allocator <char>>::__nullref]: \
        /local/ISV/libdlopened.so
    [3:1SF]: __1cDstdMbasic_string4Ccn0ALchar_traits4Cc__n0AJallocator4Cc___J__nullref_[0x30] \
        [std::basic_string <char,std::char_traits <char>,std::allocator <char>>::__nullref]: \
        /usr/lib/cpu/sparcv8plus/libCstd_isa.so.1
    .....

Now that's interesting. Here we have three occurrences of the same __nullref_ symbol, and two different instances have been bound to. The libdlopened version is also defined as protected, which means that there may be internal references to this symbol from within the same object. A quick inspection of the original process bindings for this symbol also uncovers their addresses.

    09268: 1: binding file=/usr/lib/libCstd.so.1 (0xd1677b00:0x177b00) to \
        file=/local/ISV/libdlopened.so (0xd352a040:0x192a040): \
        symbol `__1cDstdMbasic_string4Ccn0ALchar_traits4Cc__n0AJallocator4Cc___J__nullref_'

There's that bad free() address, 0xd352a040.

Now I'm not sure why the C++ implementation is trying to free a data item that exists within an object, but the core of the problem (I'm told) is that there are two instances of __nullref_ being used, and this has led to confusion. But why have we bound to two different instances?

The problem seems to stem from the search scope and visibility of the objects loaded with dlopen(). Refer to the section "Symbol Lookup" under "Runtime Linking Programming Interface" of the Linker and Libraries Guide for a detailed explanation.

By default, a dlopen() family is loaded with the RTLD_LOCAL attribute. In this customer's application, libdlopened.so is loaded by the dlopen(), and libCstd.so.1 is loaded as one of the dependencies. libCstd.so.1 is not a dependency of the Java application itself. Therefore libCstd.so.1 is maintained within the local scope of the family of dlopen objects. All objects within this family are able to bind to this dependency. Objects outside of this family cannot. But, libCstd.so.1 also acts as a filter, and brings in the filtee libCstd_isa.so.1. This filtee is effectively brought in using another dlopen(), and thus libCstd_isa.so.1 exists within its own local scope. Hence, the __nullref_ reference from libCstd_isa.so.1 cannot be satisfied by the definition in libdlopened.so - the referring object, and the defining object, live in different local scopes. Hence we get two different symbol bindings.

Sadly, this seems to be a common failure point. The C++ implementation can deposit the same data item in multiple objects. However, the design expects all such objects to be of global scope, such that interposition occurs, and only one definition from the multiple symbols is bound to. This requirement can be undermined by a number of dynamic linking techniques.

The first is the local scope families produced by dlopen() and filters, as shown by this customer's scenario - although both of these techniques have been around since the early days of Solaris. It is possible that scenarios like this are typically avoided because the application maintains its own dependency on the C++ libraries, or dlopen() is employed with the RTLD_GLOBAL flag. The scenario can also be avoided by preloading the C++ library. All these mechanisms force the C++ library to be of global scope, and hence allow interposition to bind to one instance of the problematic symbol. (Another hack for this scenario is to set LD_NOAUXFLTR=yes, which suppresses auxiliary filtering - hence libCstd_isa.so.1 wouldn't get loaded).
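
For code that calls dlopen() directly, the RTLD_GLOBAL avoidance mentioned above amounts to one extra mode bit. A minimal sketch follows. The wrapper name open_global() is hypothetical, and this does not help where the dlopen() is issued on your behalf, as with System.loadLibrary.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Open a library so that it joins the global symbol scope, rather
 * than a private RTLD_LOCAL family. Returns NULL on failure.
 */
static void *open_global(const char *path)
{
    void *handle = dlopen(path, RTLD_LAZY | RTLD_GLOBAL);

    if (handle == NULL)
        (void) fprintf(stderr, "dlopen(%s): %s\n", path, dlerror());
    return handle;
}
```

Whether promoting a whole family to global scope is acceptable depends on the application. It trades away the namespace isolation that RTLD_LOCAL exists to provide.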

However, similar issues can result from using linker options such as -Bsymbolic, from direct bindings, or from scoping dynamic object interfaces using mapfiles. The problem is that these dynamic linking technologies exist to carve out local namespaces within a process, and to protect multiple dlopen() families from adversely interacting with one another. This requirement is becoming more and more relevant in today's large dynamic applications.

C++ implementation requirements, and user dynamic linking requirements, seem to be at odds.

Perhaps it is time to invent a new symbol attribute. Attributes that allow symbols to be demoted to protected, or local scope, already exist. A previous posting introduced some compiler techniques in this area. But we have no attribute that states that a symbol must remain global, that it should have no internal or direct bindings established to it, and that it should be elevated above any local scope families created within a dynamically linked process. Perhaps with such a symbol attribute, assigned by the compilers for the symbols they know must be completely interposable, we'd establish a more robust environment.

Now, I wonder what name we'd give this new super-global attribute?




Wednesday Dec 14, 2005

Runtime Token Expansion - some clarification

I recently came across a mail exchange where the following runtime linker error message was observed:

   illegal mode: potential multiple path expansion requires RTLD_FIRST

The exchange, and a quick review of the documents, reveal that some explanation wouldn't go amiss.

The runtime linker, ld.so.1, provides a number of tokens that can be used within dynamic linking string definitions. These string definitions can provide filters, dependencies and runpath information, and are documented in the section Establishing Dependencies with Dynamic String Tokens of the Linker and Libraries Guide.

Presently, a dependency expressed within an object, points to a single file. For example:

    % elfdump -d main | fgrep NEEDED
           [0]  NEEDED      0x230   libc.so.1

Filtee definitions and runpaths, however, are frequently defined as lists of colon separated items. For example:

    % elfdump -d foo.so | egrep "FILTER|RPATH"
           [4]  FILTER       0xd8e   libbar.so.1:libnuts.so.1
          [13]  RPATH        0xbf3   /usr/ISVI/lib:/usr/ISVII/lib

The tokens $OSNAME, $OSREL, $PLATFORM and $ORIGIN all expand into a single string. For example:

    % elfdump -d libc.so.1 | fgrep AUXILIARY
           [2]  AUXILIARY    0x56ea  /platform/$PLATFORM/lib/libc_psr.so.1

can expand at runtime into:

    /platform/SUNW,Sun-Blade-1000/lib/libc_psr.so.1

This single string expansion means that these tokens can be used in filter, dependency and runpath definitions.

The tokens $HWCAP and $ISALIST however, typically expand into a list of elements. For example:

    % elfdump -d bar.so | fgrep RPATH
           [4]  RPATH       0x1ad    /usr/ISV/$ISALIST
can expand at runtime into:
    search path=/usr/ISV/$ISALIST  (RPATH from file bar.so)
    trying path=/usr/ISV/sparcv9+vis2/libfoo.so.1
    trying path=/usr/ISV/sparcv9+vis/libfoo.so.1
    trying path=/usr/ISV/sparcv9/libfoo.so.1
    trying path=/usr/ISV/sparcv8plus+vis2/libfoo.so.1
    ....
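
The difference between a single string token and a list token can be sketched as plain string substitution. This is an illustration only and not how ld.so.1 is implemented. The function names and the MAX_PATH_LEN limit are hypothetical, and the caller supplies the list of instruction set names.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PATH_LEN 256

/*
 * Replace the first occurrence of token in tmpl with value, writing
 * the result to out. Returns 0 on success, -1 if the token is absent
 * or the result does not fit.
 */
static int expand_token(const char *tmpl, const char *token,
    const char *value, char *out, size_t outlen)
{
    const char *at = strstr(tmpl, token);
    int n;

    if (at == NULL)
        return -1;
    n = snprintf(out, outlen, "%.*s%s%s", (int)(at - tmpl), tmpl,
        value, at + strlen(token));
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/*
 * Expand a list token into one candidate path per value, keeping
 * the order of the values. Returns the number of paths written.
 */
static size_t expand_list(const char *tmpl, const char *token,
    const char *const *values, size_t nvalues,
    char out[][MAX_PATH_LEN])
{
    size_t i, made = 0;

    for (i = 0; i < nvalues; i++)
        if (expand_token(tmpl, token, values[i], out[made],
            MAX_PATH_LEN) == 0)
            made++;
    return made;
}
```

A token such as $PLATFORM needs only one call to expand_token(). A token such as $ISALIST produces a candidate for every entry in the list, which is why it fits naturally where a list is already expected.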

This list is well suited for filter and runpath definitions, where lists are already expected. But what about dependency definitions? As our present implementation of dependency strings expects a single object, allowing a token that can expand into multiple objects was questioned. Basically, the infrastructure to assign multiple head objects to a handle isn't yet available, and we really don't know of anyone wanting this capability.

Because of these issues, we decided to restrict the use of $HWCAP and $ISALIST when used to define dependencies. If you use either of these tokens to establish dependencies, only the first object that is found from their expansion is used. Note that pathnames used for dependencies don't seem very common, but we've restricted their use for these tokens anyway.

Likewise, if you use these tokens in a dlopen(3c), only the first object found is applicable. But here we wanted the user to be explicit, and know what they are getting. Hence, we ask for the RTLD_FIRST flag, which happened to be lying around and seemed kind of appropriate. Without this flag you'll get the illegal mode error message. Of course, RTLD_FIRST is now a little overloaded: it restricts symbol searches, and clarifies a dlopen() request that might expand to multiple objects. Oh well.

For dlopen(), we could have enforced the RTLD_FIRST flag internally, but felt that one day we might want to enable the opening of multiple objects from one dlopen(). Without the user explicitly defining today's requirement it would be hard to extend the capabilities at a future date. The observant members of the audience will of course point out that we didn't make a similar explicit requirement for NEEDED dependencies. Sigh.
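
In code, being explicit looks something like the following sketch. The helper names are hypothetical. RTLD_FIRST is a Solaris extension, so it is guarded here to let the example compile on systems that lack it.

```c
#include <dlfcn.h>
#include <string.h>

/* Return 1 if path holds a token that can expand to several objects. */
static int has_list_token(const char *path)
{
    return strstr(path, "$ISALIST") != NULL ||
        strstr(path, "$HWCAP") != NULL;
}

/*
 * Open path, adding RTLD_FIRST when the path may expand to a list,
 * so that the request states that only the first object is wanted.
 */
static void *open_expanding(const char *path)
{
    int mode = RTLD_LAZY;

#ifdef RTLD_FIRST
    if (has_list_token(path))
        mode |= RTLD_FIRST;
#endif
    return dlopen(path, mode);
}
```

Remember the overloading mentioned above: a handle obtained with RTLD_FIRST also restricts later dlsym() searches on that handle.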

The illegal mode error message is an attempt to make users aware of a token processing restriction, that may be lifted in future.



Thursday Oct 13, 2005

A Very Slow Link-Edit - get the latest patch

A customer recently posted to the Dynamic Linking forum regarding an awfully slow link-edit. A shared library, built from Sun Studio 10 with debugging (-g), was taking 20 minutes to link on Solaris. By comparison, the same link took 4-5 minutes on Linux, and 15 seconds on Windows.

I got a copy of the objects and found that the link-edit was considerably faster on my desktop - less than a minute. This is no small link-edit. There are a number of very large input files, and in total, ld(1) processes 65057 input symbols, and the killer, over 1.3 million input relocations.

It turns out we'd already uncovered a scalability issue while investigating a slow link-edit from another customer. Basically there are some tests within ld(1) that attempt to identify displacement relocation use within data items that have the potential for copy-relocations. Not something users typically come across, but an area where our compiler engineers had once fallen foul. Thus the checks were requested by our compiler developers to aid their experimentation.

A patch already existed that addressed this slow link-edit, which was fixed under bugid 6262789. The patches are:

    Solaris/SunOS 5.10_sparc    patch 117461-03
    Solaris/SunOS 5.10_x86      patch 118345-03
    Solaris/SunOS 5.9_sparc     patch 112963-21
    Solaris/SunOS 5.9_x86       patch 113986-17
    Solaris/SunOS 5.8_sparc     patch 109147-36
    Solaris/SunOS 5.8_x86       patch 109148-36

The customer now has the relevant patch. Their link-time is down to 35 seconds, which is still not as fast as Windows, so we still have some work to do. Perhaps the compilers could generate a little less for us to do :-).



Tuesday Sep 27, 2005

Init and Fini Processing - who designed this?

Recently we had to make yet another modification to our runtime .init processing to compensate for an undesirable application interaction. I thought an overview of our torturous history of .init processing might be entertaining.

During the creation of a dynamic object, the link-editor ld(1), arranges for any .init and .fini sections to be collected into blocks of code that are executed by the runtime linker ld.so.1(1). These blocks of code are typically used to implement constructors and destructors, or code identified with #pragma init, or #pragma fini.
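
As a reminder of what such code looks like from the developer's side, here is a minimal sketch. It uses the GCC and Clang constructor and destructor attributes rather than #pragma init and #pragma fini, and that choice is mine for portability. Depending on the toolchain, the functions may be recorded in .init_array and .fini_array rather than .init and .fini. All names are hypothetical.

```c
static int lib_ready = 0;

/*
 * Runs as part of the object's initialization, before main() when
 * the object is part of an executable. With Sun Studio the
 * equivalent request would be "#pragma init(lib_init)".
 */
__attribute__((constructor))
static void lib_init(void)
{
    lib_ready = 1;
}

/* Runs as part of the object's termination, at exit() or dlclose(). */
__attribute__((destructor))
static void lib_fini(void)
{
    lib_ready = 0;
}

/* Other code in the object relies on the initialization having run. */
static int lib_is_ready(void)
{
    return lib_ready;
}
```

Everything that follows is about when, and in what order, the runtime linker calls functions like lib_init() across all the objects of a process.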

The original System V ABI was rather vague in regard to these sections, stating simply:

        After the dynamic linker has built the process image and performed the
        relocations, each shared object gets the opportunity to execute some
        initialization code.  These initialization functions are called in no
        specific order, but all shared object initializations happen before the
        executable file gains control.

        Similarly, shared objects may have termination functions, which are
        executed with the atexit(3c) mechanism after the base process begins
        its termination sequence. Once again, the order in which the dynamic
        linker calls termination functions is unspecified.

The system has evolved since this was written, and today there is an expectation that any initialization functions are called before any code within the same object is referenced. This holds true for dependencies of the application and objects that are dynamically added to a process with dlopen(3c). The reverse is expected on process termination, and when objects are removed from the process with dlclose(3c).

Today's processes, language requirements (C++), lazy-loading, together with dlopen() and dlclose() use, have resulted in a complex .init/.fini execution model that attempts to satisfy a user's expectations. However, scenarios can still exist where expectations cannot be achieved. Typically, the implementation details of .init/.fini processing aren't something any developers design to. But as the dynamic dependencies of a process change, unforeseen side-effects of this processing can cause problems. Given the complexity of today's processes, these problems can be difficult to detect, let alone prepare against.

For the rest of this discussion, let's use .init sections for examples.

At First it was Simple

In the early days of Solaris, .init sections were run in reverse load order, sometimes referred to as reverse breadth first order. If an application had the following dependencies:

        % ldd a.out
                lib1.so.1 =>     /opt/ISV/lib/lib1.so.1
                lib2.so.1 =>     /opt/ISV/lib/lib2.so.1
                libc.so.1 =>     /usr/lib/libc.so.1

then the initialization sequence would be libc.so.1 followed by lib2.so.1 followed by lib1.so.1.

This level of simplicity proved insufficient for calling .init sections in their expected order. All that was required was for a dependency to have its own dependency. For example, if lib1.so.1 had a dependency on lib3.so.1, the load order would reveal:

        % ldd a.out
                lib1.so.1 =>     /opt/ISV/lib/lib1.so.1
                lib2.so.1 =>     /opt/ISV/lib/lib2.so.1
                libc.so.1 =>     /usr/lib/libc.so.1
                lib3.so.1 =>     /opt/ISV/lib/lib3.so.1

The result of this loading was that the .init for lib3.so.1 was called before that of the system library libc.so.1. In practice this wasn't a big issue back in our very early releases. Although libc.so.1 is typically a dependency of every application and library created, its .init used to contribute little that was required by the .init's of other objects.

Issues really started to arise as C++ and/or threads use started to expand. The libraries libC.so.1 and libthread.so.1, and the C++ objects themselves, had far more complex initialization requirements. It became essential that the .init's of libC.so.1 and libthread.so.1 ran before any other object's .init.

Topological Sorting Arrives

In Solaris 6, the runtime linker started constructing a dependency ordered list of .init sections to call. This list is built from the dependency relationships expressed by each object and any bindings that have been processed outside of the expressed dependencies.

Explicit dependencies are established at link-edit, i.e., lib1.so.1 needs lib3.so.1. However, explicit dependency relationships are often insufficient to establish complete dependency relationships. It is still very typical for a shared object to be generated that does not express its dependencies. Use of the link-editor's -z defs option would enforce that dependencies be expressed, but this option isn't always used.

Therefore, the runtime-linker also adds any dependencies established by relocations to the information used for topological sorting. ldd(1) can be used to display expected .init order:

        % ldd -di a.out
                lib1.so.1 =>     ./lib1.so.1
                lib2.so.1 =>     ./lib2.so.1
                libc.so.1 =>     /usr/lib/libc.so.1
                lib3.so.1 =>     ./lib3.so.1

           init object=/usr/lib/libc.so.1
           init object=./lib3.so.1
           init object=./lib1.so.1
           init object=./lib2.so.1
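
The two orderings can be compared with a small sketch. This is a textbook depth-first ordering, not the runtime linker's actual algorithm, and the structure and function names are hypothetical. The dependency edges used in the note below (every library needing libc.so.1) are also an assumption, since ldd only shows the load order.

```c
#include <stddef.h>

#define MAX_OBJS 8

/* A loaded object: name plus indices of the objects it depends on. */
struct depobj {
    const char *name;
    int deps[MAX_OBJS];
    int ndeps;
};

/* Original scheme: run .init sections in reverse load order. */
static int reverse_load_order(int nobjs, int *order)
{
    int i;

    for (i = 0; i < nobjs; i++)
        order[i] = nobjs - 1 - i;
    return nobjs;
}

/* Depth-first visit: emit every dependency before the object itself. */
static void visit(const struct depobj *objs, int idx, int *seen,
    int *order, int *count)
{
    int i;

    if (seen[idx])
        return;     /* already emitted, or a cycle in progress */
    seen[idx] = 1;
    for (i = 0; i < objs[idx].ndeps; i++)
        visit(objs, objs[idx].deps[i], seen, order, count);
    order[(*count)++] = idx;
}

/* Dependency ordered scheme: walk the objects in load order. */
static int dependency_order(const struct depobj *objs, int nobjs,
    int *order)
{
    int seen[MAX_OBJS] = { 0 };
    int i, count = 0;

    for (i = 0; i < nobjs && i < MAX_OBJS; i++)
        visit(objs, i, seen, order, &count);
    return count;
}
```

Given the load order lib1, lib2, libc, lib3, with lib1 needing lib3 and libc, and lib2 and lib3 each needing libc, reverse_load_order() yields lib3, libc, lib2, lib1, while dependency_order() yields libc, lib3, lib1, lib2, the same order ldd reports above.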

But there's still something missing. The above example shows ldd processing only immediate (data) relocations. This is normal when executing an application, i.e., without LD_BIND_NOW being set. Typically, functions are not resolved when objects are loaded, but are done lazily when the function is first called.

The problem with the runtime linker's dependency analysis is that without resolving all relocations, including functions, the exact dependency relationship may not be known at the time .init firing starts.

Of course, we had one library that had to be treated differently, libthread.so.1. Even though this library could have dependencies on other system libraries, its .init had to fire first. This was ensured with the .dynamic flag DF_1_INITFIRST. But, this also excited others who claimed they'd like to be first too! Better mechanisms have since evolved to ensure the new merged libthread and libc are initialized appropriately.

As a side note, when topological sorting was added, the environment variable LD_BREADTH was also provided. This variable suppressed topological sorting and reverted to the original breadth first sorting. This fall back was provided in case applications were found to be dependent upon breadth first sorting, or in case bugs existed in the topological sort mechanism. Sadly, the latter proved true, and LD_BREADTH found its way into scripts and user startup files. But, as systems became more complex, LD_BREADTH became increasingly inappropriate, and its existence caused more problems than it solved. This environment variable's processing was finally removed.

Note, the debugging capabilities that are available with the runtime linker in OpenSolaris have been significantly updated so that LD_DEBUG=init,detail provides detailed information on the topological sorting process.

Dynamic .init Calling

To complete the initialization model, each time the runtime-linker resolves a function call (through a .plt), the defining object's .init is called if it hasn't already executed. With dynamic .init calling, a lazily bound family of objects can assume that an object's .init is called before code in that object is referenced.

Suppose in our a.out example, that the .init code executed for lib3.so.1 makes a call to lib2.so.1. It follows that the .init for lib2.so.1 should be called before it had previously been scheduled. This dynamic initialization cannot be observed under ldd, but can be seen with the runtime-linker's debugging:

        % LD_DEBUG=init a.out
        ......
        09086: calling .init (from sorted order): /usr/lib/libc.so.1
        09086:
        09086: calling .init (done): /usr/lib/libc.so.1
        09086:
        09086: calling .init (from sorted order): ./lib3.so.1
        09086:
        09086: calling .init (dynamically triggered): ./lib2.so.1
        09086:
        09086: calling .init (done): ./lib2.so.1
        09086:
        09086: calling .init (done): ./lib3.so.1
        09086:
        09086: calling .init (from sorted order): ./lib1.so.1
        09086:
        09086: calling .init (done): ./lib1.so.1
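
The trace above can be reproduced with a toy model. This is only a sketch of the idea, with hypothetical names throughout. Each object is pending, running or done, and a call into a pending object fires that object's .init first.

```c
#include <stddef.h>

#define NOBJS 4

enum init_state { INIT_PENDING, INIT_RUNNING, INIT_DONE };

struct initobj {
    const char *name;
    enum init_state state;
    int calls;      /* object this .init calls into, or -1 */
};

static int done_log[NOBJS];     /* order in which .init's completed */
static int ndone;

static void run_init(struct initobj *objs, int idx);

/* A function binding: fire the target's .init if it has not started. */
static void bind_call(struct initobj *objs, int target)
{
    if (objs[target].state == INIT_PENDING)
        run_init(objs, target);     /* dynamically triggered */
    /* INIT_RUNNING here means a cycle: the target is half initialized */
}

static void run_init(struct initobj *objs, int idx)
{
    objs[idx].state = INIT_RUNNING;
    if (objs[idx].calls >= 0)
        bind_call(objs, objs[idx].calls);
    objs[idx].state = INIT_DONE;
    done_log[ndone++] = idx;
}

/* Walk the sorted list, skipping anything a dynamic trigger ran. */
static void fire_sorted(struct initobj *objs, const int *sorted, int n)
{
    int i;

    ndone = 0;
    for (i = 0; i < n; i++)
        if (objs[sorted[i]].state == INIT_PENDING)
            run_init(objs, sorted[i]);
}
```

With the sorted order libc, lib3, lib1, lib2, and lib3's .init calling into lib2, the completion order becomes libc, lib2, lib3, lib1, matching the "(done)" lines in the trace. The INIT_RUNNING case in bind_call() is where the cyclic dependency problems discussed later come from.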

But, there is still something missing. What if a family of objects is not lazily bound? And, can a user assume that an object's .init has completed before the code in that object is referenced?

Loss of Lazy Binding

It has been observed that users frequently call dlopen() with RTLD_NOW. This results in functions being bound at relocation time. An interesting side-effect is how this can interact with .init firing.

Suppose a family of objects have been loaded under the default mode of lazy binding. The runtime linker has sorted this family and is in the process of firing the .init's. Now let's say that a particular .init calls dlopen() with RTLD_NOW, on members of this already loaded family. Effectively, this dlopen() may have altered the dependency ordering, as function relocations would have contributed to the topological sorting process. In addition, as there are no longer outstanding function relocations (.plt's) that would give the runtime-linker control, no dynamic .init calling for these fully relocated objects can be performed.

This observation prompted an additional level of dynamic .init calling. Whenever a family of objects is dlopen()'ed, any .init sections for those objects that have not been called are sorted and called before return from the dlopen(). Although this has always occurred for newly loaded objects, any existing objects' .init's would have been skipped as they were already part of a pending initialization thread. By always collecting the .init's of a dlopen() family we could compensate for any loss of lazy binding, and ensure that an object's .init is called before its code is first referenced.

No Lazy Binding

Objects can be loaded without lazy binding, either under control of LD_BIND_NOW, the dlopen() flag RTLD_NOW, or because they were built with the link-editor's -z now option. Although the relocation information provided to the topological sorting may be more precise than under lazy binding, there is no longer any dynamic .init calling possible.

Cyclic Dependencies

Cyclic dependencies seem to be quite common. It has been observed that calling one object's .init can result in one or more other objects being called, which in turn reference code from the originating object. Problems start manifesting themselves when this return to the originating object exercises code whose initialization has not yet completed.

When topological sorting detects cycles, the members of the cycles will have their .init's fired in reverse load order. With dynamic .init calling, this order may be a fine starting point, but without dynamic .init calling this order may not be sufficient to prepare for the execution path of code throughout the cyclic objects.

A similar issue has arisen between different threads of control. One thread may be in the process of calling an .init and get preempted for another thread that references the same object. In fact, the runtime linker is fully capable of using condition variables to synchronize .init use. However, this functionality is not enabled by default because of cyclic dependencies, and the possible deadlock conditions that can result.

Recursion with dlopen(0)

When we added the reevaluation and firing of .init's if any loaded objects were referenced by a dlopen(), dlopen(0) fell under the same model. And, although this model was integrated into Solaris a couple of years ago, it wasn't until recently that some existing applications were observed to fail because of the .init reevaluation.

The problem was that unexpected recursion was occurring. .init's were being fired, and then the code within a .init called dlopen(0). This caused a reevaluation of the outstanding .init sections, some .init's to fire, and a thread of execution that led to running code within an object whose .init had not yet completed.

Yet another refinement was added. Now, if any dlopen() operation only references objects that are already loaded, and that dlopen() operation does not promote any object's relocation requirements, i.e., doesn't use RTLD_NOW, then the thread of initialization that must presently be in existence is left to finish the job. If however, the dlopen() operation adds new objects, or promotes any existing object's relocations, then the family of objects referenced will have their .init's reevaluated, and a new thread of initialization is kicked off to process this family.

Conclusions

As you can now see, the dynamic initialization of objects within a process is quite complex, and no-one in their right mind would have ever designed it this way. This complexity has evolved, from some incredibly vague starting point, to an implementation that has become necessary to satisfy existing dynamic objects. Whether a lot of this complexity is by accident or by design, is open to debate. And whether developers have ever considered initialization requirements or designed to some goal, seems doubtful. Given the inability to initialize groups of cyclic dependencies in a correct order, you have to wonder if users ever meant to create such an environment. Personally I think most initialization "requirements" have worked more through luck than judgment.

I've tried to provide documentation to educate users of the issues facing object initialization. However, the Linker and Libraries Guide isn't the first place folks look. Documentation should probably start with the compiler documents, which are the first place developers usually go. Documenting how users should use .init's and .fini's might also be useful. Although this really falls back to the languages, like C++, to document constructor and destructor use.

Better methods of analyzing initialization dependencies may also be useful. The runtime linker's debugging capabilities are a start. Objects can also be exercised under ldd using the -i option together with -d or -r. But this still seems a little too late in the development cycle. It would be nice if we could flag things like cyclic dependencies during object development. Specifically, the cyclic dependencies of .init's and .fini's. However, one problem with today's applications is that they are not all created in one build environment. Many components, asynchronously delivered from different ISVs, are brought together to produce a complete application.

The bottom line is keep it simple. Try and get rid of .init code. It's amazing how much exists, and what it does (dlopen()/dlclose(), firing off new threads, I've even seen .init sections fork()/exec() processes). .init code often initializes an object for all eventualities, whereas much of the initialization is never used for a particular thread of execution. Make the initialization self contained by eliminating and reducing the references to external objects from the initialization code. Reducing exported interfaces can help too. By keeping things simple you can avoid much of the initialization interactions that the runtime linker's implementation has evolved to handle. Your process will start up faster too. Folks often comment how long it takes for a process to get to main. Have a look how much initialization processing comes before this!
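
One common way to shed .init code, offered here as a suggestion rather than anything the runtime linker requires, is to defer setup until the first caller actually needs it. A minimal sketch using pthread_once(3c) follows. The table and the function names are hypothetical.

```c
#include <pthread.h>

static pthread_once_t table_once = PTHREAD_ONCE_INIT;
static int table[16];
static int table_builds;    /* counts how often the setup really ran */

/* Setup that might otherwise sit in a .init section. */
static void build_table(void)
{
    int i;

    for (i = 0; i < 16; i++)
        table[i] = i * i;
    table_builds++;
}

/* Callers pay for the setup only if, and when, they first need it. */
static int table_lookup(int idx)
{
    (void) pthread_once(&table_once, build_table);
    return table[idx & 15];
}
```

A process that never calls table_lookup() never pays for build_table(), and nothing in this object needs to run before main.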



Friday Sep 09, 2005

Finding Symbols - reducing dlsym() overhead

In a previous post, I'd explained how lazy loading provides a fall back mechanism. If a symbol search exhausts all presently loaded objects, any pending lazy loaded objects are processed to determine whether the required symbol can be found. This fall back is required as many dynamic objects exist that do not define all their dependencies. These objects have (probably unknowingly) become reliant on other dynamic objects making available the dependencies they need. Dynamic object developers should define what they need, and nothing else.

dlsym(3c) can also trigger a lazy load fall back. You can observe such an event by enabling the runtime linker's diagnostics. Here, we're looking for a symbol in libelf from an application that has a number of lazy dependencies.

    % LD_DEBUG=symbols,files,bindings  main
    .....
    19231: symbol=elf_errmsg;  dlsym() called from file=main  [ RTLD_DEFAULT ]
    19231: symbol=elf_errmsg;  lookup in file=main  [ ELF ]
    19231: symbol=elf_errmsg;  lookup in file=/lib/libc.so.1  [ ELF ]
    19231:
    19231: rescanning for lazy dependencies for symbol: elf_errmsg
    19231:
    19231: file=libnsl.so.1;  lazy loading from file=main: symbol=elf_errmsg
    ......
    19231: file=libsocket.so.1;  lazy loading from file=main: symbol=elf_errmsg
    ......
    19231: file=libelf.so.1;  lazy loading from file=main: symbol=elf_errmsg
    ......
    19231: symbol=elf_errmsg;  lookup in file=/lib/libelf.so.1  [ ELF ]
    19231: binding file=main to file=/lib/libelf.so.1: symbol `elf_errmsg'

Exhaustively loading lazy dependencies to resolve a symbol isn't always what you want. This is especially true if the symbol may not exist. In Solaris 10 we added RTLD_PROBE. This handle results in the same lookup semantics as RTLD_DEFAULT, but does not fall back to an exhaustive loading of pending lazy objects. It can be thought of as the lightweight version of RTLD_DEFAULT.

Therefore, if we wanted to test for the existence of a symbol within the objects that were presently loaded within a process, we could use dlsym() to probe the process:

    % LD_DEBUG=symbols,files,bindings  main
    .....
    19251: symbol=doyouexist;  dlsym() called from file=main  [ RTLD_PROBE ]
    19251: symbol=doyouexist;  lookup in file=main  [ ELF ]
    19251: symbol=doyouexist;  lookup in file=/lib/libc.so.1  [ ELF ]
    ......
    19251: ld.so.1: main: fatal: doyouexist: can't find symbol
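
In code, a probe like the one traced above might look like the following. This is a sketch under assumptions: symbol_is_loaded() is a hypothetical helper, and because RTLD_PROBE is Solaris specific, the code falls back to a dlopen(NULL, ...) handle on other systems. That fallback searches the objects of the process, but makes no promise about avoiding lazy loads.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Return the handle used for probing. On Solaris this is RTLD_PROBE.
 * Elsewhere (an assumption made only so the sketch compiles) it is a
 * handle to the process obtained with dlopen(NULL, ...).
 */
static void *
probe_handle(void)
{
#ifdef RTLD_PROBE
	return (RTLD_PROBE);
#else
	return (dlopen(NULL, RTLD_LAZY));
#endif
}

/*
 * Hypothetical helper: return 1 if name can be found in the objects
 * searched through the probe handle, 0 otherwise.
 */
int
symbol_is_loaded(const char *name)
{
	return (dlsym(probe_handle(), name) != NULL);
}
```

A caller could then guard an optional feature with something like symbol_is_loaded("doyouexist") without dragging in every pending lazy dependency.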

When dlsym() is used to locate symbols from a handle returned by dlopen(3c), all the dependencies associated with the handle are available to the symbol lookup. I always thought this was rather odd, and that dlsym() should only look at the initial object of a handle. In other words, if you:

    if ((handle = dlopen("foo.so", RTLD_LAZY)) != NULL) {
            fptr = dlsym(handle, "foo");
    }

then intuitively the search for foo would be isolated to foo.so, and not include the multitude of dependencies also brought in by foo.so. But, that is not how our founding fathers developed dlsym(). I think we even considered changing the behavior once so that dlsym() would only search the initial object. But we soon found a number of applications fell over as they could no longer find the symbols they were used to finding.

In Solaris 9 8/03 we provided an extension to dlopen() with the new flag RTLD_FIRST. By using this flag the same series of objects is opened and associated with a handle. However, only the first object on the handle is made available for dlsym() searches.
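
A lookup restricted to the first object might be written as follows. This is a sketch, not a definitive implementation: open_and_find() is a hypothetical helper, and on systems that do not define RTLD_FIRST the flag is defined as 0 purely so the code compiles, in which case the search is not restricted.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Assumption: where RTLD_FIRST is not provided, the flag becomes a no-op. */
#ifndef RTLD_FIRST
#define	RTLD_FIRST	0
#endif

/*
 * Hypothetical helper: open path, look up name in the first object of
 * the handle, and return the symbol address. On success the handle is
 * stored in *handlep so the caller can dlclose() it later. On failure
 * NULL is returned and *handlep is set to NULL.
 */
void *
open_and_find(const char *path, const char *name, void **handlep)
{
	void *handle, *addr;

	*handlep = NULL;
	if ((handle = dlopen(path, RTLD_LAZY | RTLD_FIRST)) == NULL)
		return (NULL);

	if ((addr = dlsym(handle, name)) == NULL) {
		/* Nothing found in the first object, so drop the handle. */
		(void) dlclose(handle);
		return (NULL);
	}
	*handlep = handle;
	return (addr);
}
```

With the earlier example, open_and_find("foo.so", "foo", &handle) would only consider foo.so itself, and not the dependencies that foo.so brought in.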

Perhaps RTLD_PROBE and RTLD_FIRST can reduce your dlsym() overhead.



Tuesday Jun 14, 2005

The Link-editors - a source tour

Welcome to OpenSolaris. I've been working with the link-editors for many years, and I thought that with the general availability of the source, now would be an opportune time to cover some history, and give a brief overview of the link-editors source hierarchy.

The link-editor components reside under the usr/src/cmd/sgs directory. This Software Generation Subsystem hierarchy originated from the AT&T and Sun collaboration that produced Solaris 2.0. Under this directory exist the link-editors, and various tools that manipulate or display ELF file information. There are also some ancillary components that I've never modified. I believe at some point it may also have contained compilers, however these have long since moved to their own separate source base.

The Link-Editor

When you mention the link-editor, most folks think of ld(1). You'll find this under sgs/ld. However, this binary is only a stub that provides argument processing and then dynamically loads the heart of the link-editor, libld.so. This library comes in two flavors, a 32-bit version and a 64-bit version, both capable of producing a 32-bit or 64-bit output file. The class of library that is loaded is chosen from the class of the first input relocatable object read from the command line. This model stems from a compiler requirement that the link-editor class remain consistent with various compiler subcomponents.
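
The class of an ELF object is recorded in the EI_CLASS byte of its e_ident[] header array, so a stub can make this choice by reading a handful of bytes. The following is a sketch of that idea, not the actual ld(1) source: elf_class_bits() is a hypothetical name, and the constants are the ELF specification values given local names.

```c
#include <stddef.h>
#include <string.h>

/* ELF specification values, locally named to avoid clashing with <elf.h>. */
enum {
	SK_EI_CLASS = 4,	/* index of the class byte in e_ident[] */
	SK_ELFCLASS32 = 1,	/* 32-bit object */
	SK_ELFCLASS64 = 2	/* 64-bit object */
};

/*
 * Hypothetical helper: given the first len bytes of a file, return 32
 * or 64 according to the ELF class, or -1 if this is not a
 * recognizable ELF object.
 */
int
elf_class_bits(const unsigned char *ident, size_t len)
{
	static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

	if (len <= SK_EI_CLASS || memcmp(ident, magic, sizeof (magic)) != 0)
		return (-1);

	switch (ident[SK_EI_CLASS]) {
	case SK_ELFCLASS32:
		return (32);
	case SK_ELFCLASS64:
		return (64);
	default:
		return (-1);
	}
}
```

A stub following this model would apply such a check to the first relocatable object on the command line and then load the matching flavor of the library.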

The Runtime Linker

However, there's another link-editor that is required to execute every application on Solaris. This editor takes over where the standard link-editor left off, and is referred to as the runtime-linker, ld.so.1(1). You can find this under sgs/rtld. The runtime linker takes an application from exec(2), loads any required dependencies, and binds the associated objects together with the information left from ld(1). The runtime linker can also be called upon by the application to load additional dependencies and locate symbols.

This very close association of ld(1) and ld.so.1(1) is one reason the link-editors are considered part of the core OS rather than a component of the compilers. This separation has also ensured the link-editors are compiler neutral.

One historic area of the runtime linker is its AOUT support. Objects from our SunOS4.x release were in AOUT format, and to aid customer transition from this release to Solaris, support for executing AOUT applications was provided by ld.so.1(1). We keep thinking that we're long past this transition need, and that this support could be purged from the system. However, we continue to come across customers that are still running an AOUT binary on Solaris. Sometimes the customer is Sun!

Also, if you poke around the relocation files for ld(1) and ld.so.1(1), you'll find a mechanism for sorting and counting relative relocations. This allows a faster processing loop for these relocations at runtime. Bryan did this before going on to bigger and better projects. It took him a couple of deltas to get things right, but he was a young lad back then.
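
To see why this helps, note that a relative relocation needs no symbol lookup: the value stored is just the object's load address plus an addend. If ld(1) sorts these relocations to the front of the table and records how many there are (the DT_RELACOUNT and DT_RELCOUNT dynamic entries exist for this purpose), the runtime loop can be very tight. The sketch below illustrates the idea only. The sk_rela_t type and apply_relative_relocs() are hypothetical and are not the ld.so.1 source.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an Elf32_Rela/Elf64_Rela entry. */
typedef struct {
	uintptr_t	r_offset;	/* where to store, relative to base */
	uintptr_t	r_info;		/* unused for relative relocations */
	intptr_t	r_addend;	/* value to add to the load address */
} sk_rela_t;

/*
 * Apply the first relacount entries, all assumed to be relative
 * relocations. No type test or symbol lookup is needed in the loop.
 */
void
apply_relative_relocs(unsigned char *base, const sk_rela_t *rela,
    size_t relacount)
{
	for (size_t i = 0; i < relacount; i++) {
		uintptr_t value = (uintptr_t)base + (uintptr_t)rela[i].r_addend;

		/* memcpy() avoids any alignment assumption about the target. */
		(void) memcpy(base + rela[i].r_offset, &value, sizeof (value));
	}
}
```

The remaining relocations, which do require symbol resolution, would then be handled by the general loop starting at entry relacount.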

Support Libraries

There are various support libraries employed by the link-editors. A debugging library, liblddbg.so, is employed by ld(1), ld.so.1(1) and elfdump(1) to provide tracing diagnostics. A common library is used to ensure the debugging information looks consistent between the various tools. ld(1) uses libldmake.so to provide .make.state support, and libldstab.so for generic .stabs processing. ld.so.1(1) uses librtld.so for extending the runtime dynamic linking support, and librtld_db.so for mdb(1) and various proc tool support. ld.so.1(1) also uses libld.so to process relocatable objects.

As you can see, there are a lot of interrelationships between the various components of the link-editors. The interfaces between these components are private and often change. When providing updates to the link-editors in patches and updates, this family of components is maintained and supplied as a whole unit.

Proto Build

As part of building the link-editor components, you might notice that we first build a version of ld(1) under sgs/proto, then use this version of ld(1) to build the other link-editor components, including the final ld(1). This two-stage build has developed as we frequently use new link-editor capabilities and flags to build our own components. A case of eating your own dog food. Without this two-stage build we would first have to integrate a version of ld(1) that provides the new capabilities, wait a week or two for this version of ld(1) to propagate into developers' build environments, and then integrate the updates that require the new capabilities. Our two-stage build makes for a much faster turn-around. And, should we break something, we're usually the first to find out as we develop our changes.

Package Build

Under sgs/packages you'll see we have the capability of building our own package. This isn't the official package(s) that the link-editors are distributed under, but a sparse package, containing all our components. This package is how we install new link-editors quickly on a variety of test machines, or provide to other developers to test new capabilities or bug fixes, before we integrate into an official build. Note, there's no co-ordination between this package and the official package database, it's really no different than tar(1)'ing the bits onto your system, except you can back the changes out!

Patches

We make a lot of patches. Sure, there are bugs and escalations that need resolving, but we frequently have to make new capabilities available on older releases. The compilers are released asynchronously from the core OS, and new capabilities required by these compilers must be made available on every release the compilers are targeted to.

We have a unique way of generating patches. When asked to generate a patch we typically backport all the latest and greatest components. As I described earlier, there's a lot of interaction between the various components, and thus trying to evaluate whether an individual component can be delivered isn't always easy. So, we've cut this question out of the puzzle from the start, and always deliver all the link-editor components as a family.

Trying to isolate a particular bug fix can also be challenging. It may look like a two line code fix addresses a customer escalation, but these two lines are often dependent on some other fixes, in other files, that occurred many months before. Trying to remember, and test for all these possible interactions can be a nightmare, so we've removed this question from the puzzle too.

When we address a bug, we address it in the latest source base. If the bug can't be duplicated, then it may have been fixed by some previous change, in which case we'll point to the associated patch. Otherwise, we'll use all the resources available on the latest systems to track down and fix the issue. Yep, that means we get to use the latest mdb(1) features, dtrace(1M), etc. There's nothing more frustrating than having to evaluate a bug on an old release where none of your favorite tools exist. Sometimes we have to fall back to an older release, but we try and avoid it if we can.

Having a fix for the issue, we'll integrate the changes in the latest Solaris release. And, after some soak time, in which the fix has gone through various test cycles and been deployed on our desktops and building servers, we'll integrate the same changes in all the patch gates. Effectively, we're only maintaining one set of bits across all releases. This greatly reduces the maintenance of the various patch environments, and frees up more time for future development.

This model hasn't been without some vocal opponents - "I want a fix for xyz, and you're giving me WHAT!". But most have come around to the simplicity and efficiency of the whole process. Is it flawless? No. Occasionally, regressions have occurred, although these are always in some area that has been outside of the scenarios we're aware of, or test for. Customers always do interesting things. But it will be a customer who finds such an issue, either in a major release, update or patch. Our answer is to respond immediately to any such issues. Our package build comes in very handy here.

You can always find the bugs we've fixed and patches generated from our SUNWonld-README file.

Other Stuff

Some other support tools are elfdump(1), ldd(1), and pvs(1). And, there's crle(1), moe(1), and lari(1). I quite enjoyed coming up with these latter names, but you might need some familiarity with old American culture to appreciate this. Which is rather odd in itself, as I'm British.


Anyway, hopefully this blog has enlightened you on navigating the OpenSolaris hierarchy in regard to the link-editors.

Have fun, and in respect for a popular, current media event - may the source be with you.


