LD_LIBRARY_PATH - just say no

A recent email discussion reminded me of how fragile, and prevalent, LD_LIBRARY_PATH use is. Within a development environment, this variable is very useful. I use it all the time to experiment with new libraries. But within a production environment, use of this environment variable can be problematic. See Directories Searched by the Runtime Linker for an overview of LD_LIBRARY_PATH use at runtime.

People use this environment variable to establish search paths for applications whose dependencies do not reside in constant locations. Sometimes wrapper scripts are employed to set this variable; other times users maintain an LD_LIBRARY_PATH within their .profile. This latter model can often get out of hand - try running:

    % ldd -s /usr/bin/date
    ...
    find object=libc.so.1; required by /usr/bin/date
	search path=/opt/ISV/lib	 (LD_LIBRARY_PATH)

If you have a large number of LD_LIBRARY_PATH components specified, you'll see libc.so.1 being wastefully searched for, until it is finally found in /usr/lib. Excessive LD_LIBRARY_PATH components don't help application startup performance.
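To get a feel for how much searching an inherited setting adds, the components can simply be listed and counted. The following is a minimal sketch in plain sh; list_components and count_components are hypothetical helper names, not part of any system tool, and the count is only a rough upper bound on the extra directories probed for each dependency.

```shell
#!/bin/sh
# Print each component of a colon-separated search path on its own line,
# in the order the runtime linker would try them.
# (list_components is a hypothetical helper, not a system utility.)
list_components() {
    # An unset or empty variable contributes no components.
    [ -n "$1" ] || return 0
    printf '%s\n' "$1" | tr ':' '\n'
}

# Count the components in a colon-separated search path.
count_components() {
    list_components "$1" | wc -l | tr -d ' '
}

echo "LD_LIBRARY_PATH has $(count_components "${LD_LIBRARY_PATH:-}") component(s)"
list_components "${LD_LIBRARY_PATH:-}"
```

Every directory listed is tried before the default locations, for every dependency, including libc.so.1.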

Wrapper scripts attempt to compensate for inherited LD_LIBRARY_PATH use. For example, a version of acroread reveals:

    LD_LIBRARY_PATH="`prepend "$ACRO_INSTALL_DIR/$ACRO_CONFIG/lib:\\
	$ACRO_INSTALL_DIR/$ACRO_CONFIG/lib" "$LD_LIBRARY_PATH"`

The script is prepending its LD_LIBRARY_PATH requirement to any inherited definition. Although this provides the necessary environment for acroread to execute, we're still wasting time looking for any system libraries in the acroread sub-directories.
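The body of acroread's prepend function isn't shown here, so the following is only a guess at its shape: a hypothetical helper that puts the new components in front of whatever was inherited. The directory names in the usage lines are invented for illustration.

```shell
#!/bin/sh
# prepend NEW INHERITED
# Hypothetical reconstruction of a wrapper-script helper: emit NEW followed
# by the INHERITED search path, avoiding a trailing ':' when nothing was
# inherited.
prepend() {
    if [ -n "$2" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Illustrative values only.
inherited="/opt/ISV/lib:/usr/local/lib"
prepend "/opt/Acrobat/lib" "$inherited"
# prints /opt/Acrobat/lib:/opt/ISV/lib:/usr/local/lib
```

The inherited components never go away; each wrapper just makes the list longer, and every system library lookup walks all of it.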

When 64-bit binaries came along, we had a bit of a dilemma with how to interpret LD_LIBRARY_PATH. But, because of its popularity, it was decided to leave it applicable to both classes of binaries (64 and 32-bit), even though it's unusual for a directory to contain both 64 and 32-bit dependencies. We also added LD_LIBRARY_PATH_64 and LD_LIBRARY_PATH_32 as a means of specifying search paths that are specific to a class of objects. These class specific environment variables are used instead of any generic LD_LIBRARY_PATH setting.

Which leads me back to the recent email discussion. Seems a customer was setting both the _64 and _32 variables as part of their startup script, because both 64 and 32 bit processes could be spawned. However, one spawned process was acroread. Its LD_LIBRARY_PATH setting was being overridden by the _32 variable, and hence it failed to execute. Sigh.
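The failure is easy to model. The sketch below is not the runtime linker's code, just the rule stated above written out in sh: when a class specific value is present it is used instead of the generic one. The function name and all paths are hypothetical, and treating an empty value as "not set" is an assumption.

```shell
#!/bin/sh
# effective_search_path GENERIC SPECIFIC
# Model of the rule described in the text: a class-specific setting
# (LD_LIBRARY_PATH_32 or LD_LIBRARY_PATH_64) replaces, rather than extends,
# the generic LD_LIBRARY_PATH. An empty SPECIFIC is treated as unset
# (an assumption of this sketch).
effective_search_path() {
    if [ -n "$2" ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$1"
    fi
}

# The customer's startup script exported a 32-bit specific path...
startup_32="/opt/customer/lib"
# ...and the acroread wrapper later set only the generic variable.
wrapper_generic="/opt/Acrobat/lib:/opt/ISV/lib"

effective_search_path "$wrapper_generic" "$startup_32"
# prints /opt/customer/lib
```

The wrapper's carefully prepended directories never get searched, so a 32-bit acroread can't find its own libraries.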

Is there a solution to this mess? I guess we could keep bashing LD_LIBRARY_PATH into submission some way, but why not get rid of the LD_LIBRARY_PATH requirement altogether? This can be done. Applications and dependencies can be built to include a runpath using ld(1), and the -R option. This path is used to search for the dependencies of the object in which the runpath is recorded. If the dependencies are not in a constant location, use the $ORIGIN token as part of the pathname.
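As a rough illustration of what $ORIGIN buys you, here is a small sh sketch that mimics the expansion. It assumes the token simply stands for the directory containing the object that records the runpath; the real runtime linker works from the object's resolved pathname, so treat this as a model only. expand_origin and all file names are hypothetical.

```shell
#!/bin/sh
# Recording the runpath at link time (Solaris cc/ld; file and library names
# are hypothetical). The single quotes keep the shell from expanding $ORIGIN:
#
#   cc -o bin/app app.o -R '$ORIGIN/../lib' -Llib -lfoo

# expand_origin OBJECT RUNPATH
# Replace each $ORIGIN token in RUNPATH with the directory containing OBJECT.
# Sketch only: directory names containing '|', '&' or '\' would confuse sed.
expand_origin() {
    origin=$(dirname "$1")
    printf '%s\n' "$2" | sed "s|\\\$ORIGIN|$origin|g"
}

expand_origin /opt/ISV/bin/app '$ORIGIN/../lib'
# prints /opt/ISV/bin/../lib

# Install the same tree somewhere else and the search path follows it.
expand_origin /export/apps/ISV/bin/app '$ORIGIN/../lib'
# prints /export/apps/ISV/bin/../lib
```

No environment variable is involved, so nothing a user or another wrapper sets can get in the way.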

Is there a limitation to $ORIGIN use? Yes, as directed by the security folks, expansion of this token is not allowed for secure applications. But then again, for secure applications, LD_LIBRARY_PATH components are ignored for non-secure directories anyway. See Security.

For a flexible mechanism of finding dependencies, use a runpath that includes the $ORIGIN token, and try not to create secure applications :-)

Comments:

Dude, if you want people to stop using LD_LIBRARY_PATH, there are two easy things to help. (1) create LD_LIB_SUBSTITUTE for developer use. It would have a value like: "/lib/libc.so.1=/home/user/proj/mylibc.so.1". When the runtime linker sees the first library, it automatically uses the second library in place of it. Also, since many different pathnames can find the same file, you'd want the comparison of the first pathname to be based on inode# if possible. This would let developers move away from using LD_LIBRARY_PATH. (2) In many cases, the person writing the start-up script doesn't have the ability to recompile and relink the application they are wrapping. If there was a utility like /bin/rpath that people could use to patch an executable to contain a new rpath setting, then I bet lots of people would use it to repair badly linked binaries instead of being forced to work around bugs in other people's programs by creating wrapper scripts. --chris

Posted by Chris Quenelle on July 11, 2004 at 08:23 AM PDT #

I sometimes use LD_PRELOAD rather than LD_LIBRARY_PATH, which operates somewhat similarly to the LD_LIB_SUBSTITUTE mentioned by Chris. LD_PRELOAD=/home/usr/proj/mylibc.so.1 will cause mylibc.so.1 to be loaded and checked for the function and if found will not continue to search for the function. It all works fine until you run into a program that sets the environment pointer to null, but then none of these variables will work anyway!

Posted by Albert White on July 11, 2004 at 09:14 AM PDT #

Why not avoid the whole matter and rely on crle? Or is this to simply avoid root access? While there are still good cases in which LD env's are useful, it seems like 90% of the time that people use LD env's they should be using crle.

Posted by benr on July 11, 2004 at 10:39 AM PDT #

Perhaps the point of my posting has been missed. The intent was to suggest folks use a runpath with $ORIGIN, rather than expect LD_LIBRARY_PATH to be set. LD_LIBRARY_PATH, or crle(1), provides a global variable that multiple folks wish to manipulate in order for their processes to run. This multiple interaction is the source of the problem. And this isn't going to be solved by adding more LD_ environment variables. A runpath is a local variable. It pertains to the object that defines it, and is used only to find that object's dependencies. This "localization" shelters the search path from external mismanagement. [Also, I believe most wrappers are delivered with their underlying components - acroread is a typical example. Plus, you can't edit an existing runpath - well, you could make it smaller, but not larger - as the string is part of the read-only text segment of the object].

Posted by Rod Evans on July 12, 2004 at 05:35 AM PDT #

You can make the RPATH larger, it's just more work. You can append the new string to the end of the read only data section and point to the new string. You don't have to re-link the entire app. Or so it seems to me. A utility to fix the RPATH will allow a larger number of people to "do the right thing", not just the original creator of the software. Look at it as a marketing thing. --chris

Posted by Chris Quenelle on July 20, 2004 at 03:09 PM PDT #

The read only data section is part of the text segment. When the memory image of the file is created, there is typically space between the text and data segments. But within the file image, the text and data segments are typically adjacent. Thus there is no space to put this new string.

If we squeezed a new string between the text and data, the data location would change, which would invalidate offsets and relocations that had already been established during the link-edit of the file.

Plus, the runpath index from the .dynamic table is an index into the dynamic string table .dynstr. If we were to create a runpath in some other section, and fabricate an index to it, it's quite possible that some ELF tools will flag this index as invalid.

And, you can't squeeze a new string at the end of the writable data segment either, as this is where the .bss has been established, and gets zeroed at runtime.

Posted by Rod Evans on July 22, 2004 at 03:12 AM PDT #

Well, if we introduce a new ELF section (or use some other section that can be extended) and let the runtime linker check the section and get the rpath from there and override the existing rpath, that would work. Another even more clunky workaround is to fix the compiler/linker to add a fixed size ( MAXPATH ?) empty rpath to executables if it was built without an explicit rpath with the $ORIGIN macro. Of course, this won't help the existing binaries, whereas the former approach would work, but that requires a new linker. I think it's hopeless to wait for them to fix the way they build their program since lazy developers will be lazy no matter how hard we try to educate them. So I think we need some mechanism to address this.

Posted by Seongbae Park on July 23, 2004 at 10:58 AM PDT #

Yes, we've considered all these, but as you point out, they don't fix old objects. And if folks can rebuild their objects they should consider the $ORIGIN route.

I don't think developers are lazy, they just don't always grasp some of the fine points of delivering a product into an environment where it can be used in multiple ways. Those that know of $ORIGIN are using it. Linux implemented it too. The intent of this article was simply to continue spreading the word.

Posted by Rod Evans on July 26, 2004 at 09:32 AM PDT #
