Purging LD_LIBRARY_PATH

Java developers are familiar with dynamic linking. Class files are a kind of intermediate format with symbolic references. At runtime, a class loader will load, link, and initialize new types as needed. Typically the full classpath a class loader uses for searching will have several logically distinct subcomponents, including the boot classpath, endorsed standards, extension directories, and the user-specified classpath. The manifest of a jar file can also contain Class-Path entries. Together, these paths delineate the boundaries of "jar hell."

For many years, modern Unix systems have also supported dynamic linking for C programs. Instead of a classpath, there is a runpath of locations to search when resolving symbolic references. Like the classpath, the full runpath has multiple components, including a default component for system facilities (analogous to the boot classpath), a component stored in a shared object (analogous to jar file Class-Path entries), as well as an end-user specified component (analogous to the -classpath command line option or CLASSPATH environment variable). The details of linking on Solaris are well explained in Sun's Linker and Libraries Guide. Other contemporary Unix platforms like Linux and MacOS have similar facilities, although the details of the various commands differ.

One of the tasks the JDK's launcher has handled is setting a suitable runpath for the JVM and platform libraries. Historically a runpath was needed to link in the desired JVM, such as client or server, and other system libraries. The client JVM and the server JVM are separate shared objects which support the same set of interfaces; by interpreting the command line flags the launcher selects which JVM to link in. Operationally, the linking is initiated by the Unix dlopen library call. So that the caller of the java command did not need to set LD_LIBRARY_PATH, after selecting the JVM to run, the launcher would modify the LD_LIBRARY_PATH environment variable by prepending the path to the JVM shared object (and paths to other directories with JDK native system libraries). However, the runtime linker only reads the value of LD_LIBRARY_PATH when a process starts. Therefore, to have the new value take effect, the launcher would call an exec-family system call to start the process anew. Such re-execing to set LD_LIBRARY_PATH is not recommended practice on Unix systems.

The re-execing to set LD_LIBRARY_PATH had a number of unpleasant consequences in the launcher code. There is only a narrow path to pass information between the exec parent and the exec child, such as by modifying environment variables, which is generally discouraged. To decide whether or not an exec was needed, the launcher checked whether the prefix of LD_LIBRARY_PATH had the expected value; if it did, no exec was done for that purpose and infinite exec loops were avoided. Presetting LD_LIBRARY_PATH to the right value before calling java could thus be used to suppress the exec. There were also complications with correctly supporting multiple LD_LIBRARY_PATH variables on Solaris [1] and handling suid java executions on Linux. [2]
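The prefix check described above can be sketched roughly as follows. This is only an illustration in Java of the decision the C launcher made, not the actual java_md.c code; the class name, method names, and directory layout are all hypothetical, and the suid special case is not modeled.

```java
/** Hypothetical sketch of the old prefix check; not the actual java_md.c logic. */
final class LdPathPrefixCheck {
    private LdPathPrefixCheck() {}

    /** Puts the JDK directories in front of the current value, which may be null or empty. */
    static String prepend(String jdkPaths, String current) {
        if (current == null || current.isEmpty()) {
            return jdkPaths;
        }
        return jdkPaths + ":" + current;
    }

    /** True when the current value does not already begin with the JDK directories. */
    static boolean needsReExec(String jdkPaths, String current) {
        if (current == null) {
            return true;
        }
        return !(current.equals(jdkPaths) || current.startsWith(jdkPaths + ":"));
    }

    public static void main(String[] args) {
        // Assumed directory layout, for illustration only.
        String jdkPaths = "/opt/jdk/jre/lib/amd64/server:/opt/jdk/jre/lib/amd64";
        String current = System.getenv("LD_LIBRARY_PATH");
        if (needsReExec(jdkPaths, current)) {
            System.out.println("would exec with LD_LIBRARY_PATH=" + prepend(jdkPaths, current));
        } else {
            System.out.println("prefix already present, no exec needed");
        }
    }
}
```

Because the child process sees the value produced by prepend, its own call to needsReExec returns false, which is what stops the exec loop after one iteration.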

The proper way to accommodate such dependencies is not to set LD_LIBRARY_PATH but rather to use the runtime linker facilities analogous to jar file Class-Path entries; the facility is the $ORIGIN dynamic string token for the runtime linker. As the name implies, $ORIGIN is expanded to the directory containing the object file in question; thus relative paths to other directories can be specified. Therefore, as long as the directory structure of the JDK and JRE is known, $ORIGIN can be used to record any necessary dependencies.
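To make the expansion concrete, here is a small Java sketch that mimics how a runpath entry containing $ORIGIN resolves relative to the object that records it. The real work is done by the runtime linker, not by Java code; the class name and the paths are hypothetical, and the normalization step is only there to make the result easier to read.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical illustration of $ORIGIN expansion; the runtime linker does this natively. */
final class OriginToken {
    private OriginToken() {}

    /** Replaces $ORIGIN in a runpath entry with the directory holding the given object file. */
    static Path expand(String runpathEntry, Path objectFile) {
        Path originDir = objectFile.toAbsolutePath().getParent();
        String expanded = runpathEntry.replace("$ORIGIN", originDir.toString());
        // Collapse ".." segments for readability only.
        return Paths.get(expanded).normalize();
    }

    public static void main(String[] args) {
        // Assumed layout: the launcher lives in bin, its libraries in a sibling lib directory.
        Path launcher = Paths.get("/opt/jdk/bin/java");
        System.out.println(expand("$ORIGIN/../lib", launcher));
    }
}
```

Since the entry is relative to the object itself, the whole JDK directory tree can be moved and the dependency still resolves, provided the internal layout stays the same.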

For some time, the JDK build has actually used $ORIGIN in creating its native libraries. Therefore, it may have been the case that LD_LIBRARY_PATH was not actually needed. However, verifying this would require building an exec-free JDK on all supported Unix platforms and running tests that exercise all the libraries in the directories no longer added to LD_LIBRARY_PATH. The engineering for Kumar's purge of execing for LD_LIBRARY_PATH was generally straightforward: deleting the LD_LIBRARY_PATH-related code in the Unix java_md.c file and doing builds on all platforms. Most of the effort of getting this fix back involved running tests to verify everything still worked. The testing revealed an unneeded, troublesome symlink that was removed at the same time LD_LIBRARY_PATH usage was purged.

While the launcher no longer execs to set the LD_LIBRARY_PATH, there are still cases where an exec will occur for other reasons. If the java command is requested to change data models using the -d32 or -d64 flag, that is, a 32-bit java command is asked to run a 64-bit JVM or vice versa, an exec is needed to effect the change. Multiple JRE support, where a different version is requested via the -version:foo flag, will also cause an exec if a different Java version needs to be run. However, before Kumar's fix the common case was that the launcher would exec once; now the common case is that the launcher will exec zero times. [3]

I'm very happy this messy use of LD_LIBRARY_PATH has finally been removed in JDK 7. The removal makes the launcher code both simpler and more maintainable. Unless your use of java relies on the number of execs that occur, the change should be largely transparent, other than startup being marginally faster. One situation to be aware of is launching an LD_LIBRARY_PATH-free JDK 7 java command from a JDK 6 or earlier java process. If the LD_LIBRARY_PATH variable of the older JDK is not cleared, it can affect the linking of the JDK 7 process.
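For the last situation, one possible approach is to clear the variable in the child's environment before starting it. The following is a minimal sketch using ProcessBuilder, written to compile on older JDKs; the class and method names are hypothetical, and it assumes the child does not need any user-supplied LD_LIBRARY_PATH entries, since those are dropped along with the ones the older launcher added.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

/** Hypothetical helper for starting a newer java command from an older java process. */
final class CleanChildLauncher {
    private CleanChildLauncher() {}

    /** Builds a child process whose environment has no LD_LIBRARY_PATH entry. */
    static ProcessBuilder withoutLdLibraryPath(String... command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        // The environment starts as a copy of the parent's, including any
        // LD_LIBRARY_PATH value the older launcher set before re-execing.
        pb.environment().remove("LD_LIBRARY_PATH");
        pb.redirectErrorStream(true);
        return pb;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // The path to the JDK 7 java command is an assumption; pass the real one as an argument.
        String javaCommand = args.length > 0 ? args[0] : "java";
        Process child = withoutLdLibraryPath(javaCommand, "-version").start();
        BufferedReader out = new BufferedReader(new InputStreamReader(child.getInputStream()));
        String line;
        while ((line = out.readLine()) != null) {
            System.out.println(line);
        }
        System.exit(child.waitFor());
    }
}
```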


[1] Since Solaris 7, that OS line has supported three LD_LIBRARY_PATH variables:

  • LD_LIBRARY_PATH_32: if set, overrides LD_LIBRARY_PATH for 32-bit processes.

  • LD_LIBRARY_PATH_64: if set, overrides LD_LIBRARY_PATH for 64-bit processes.

  • LD_LIBRARY_PATH: used by both 32-bit and 64-bit processes if not overridden by a data model specific variable.

On Solaris, back in JDK 1.4.2 I fixed the launcher to properly take into account all three variables (4731671); on re-exec the data model specific environment variable is unset and LD_LIBRARY_PATH contains the old data model specific value prepended with the JDK system paths. Tests to verify all this used to live in and around test/tools/launcher/SolarisDataModel.sh, but they have thankfully been deleted as they are no longer relevant.
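The override rules and the re-exec behavior just described can be sketched as below. This is a Java illustration of the logic rather than the launcher's C code; the class and method names are hypothetical, "set" is taken to mean present in the environment, and the handling of the case where no data model specific variable is set is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the three Solaris variables; not the actual launcher code. */
final class SolarisLibraryPath {
    private SolarisLibraryPath() {}

    /** The value the runtime linker would honor for a process of the given data model. */
    static String effective(Map<String, String> env, boolean is64Bit) {
        String specific = env.get(specificName(is64Bit));
        return specific != null ? specific : env.get("LD_LIBRARY_PATH");
    }

    /** The environment after re-exec: specific variable unset, JDK paths prepended. */
    static Map<String, String> afterReExec(Map<String, String> env, boolean is64Bit,
                                           String jdkPaths) {
        String old = effective(env, is64Bit);
        Map<String, String> child = new HashMap<String, String>(env);
        child.remove(specificName(is64Bit));
        child.put("LD_LIBRARY_PATH",
                  (old == null || old.isEmpty()) ? jdkPaths : jdkPaths + ":" + old);
        return child;
    }

    private static String specificName(boolean is64Bit) {
        return is64Bit ? "LD_LIBRARY_PATH_64" : "LD_LIBRARY_PATH_32";
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<String, String>();
        env.put("LD_LIBRARY_PATH", "/usr/local/lib");
        env.put("LD_LIBRARY_PATH_64", "/usr/local/lib/64");
        // "/opt/jdk/lib/64" is an assumed JDK system path.
        System.out.println(afterReExec(env, true, "/opt/jdk/lib/64"));
    }
}
```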

[2] For suid or sgid binaries, LD_LIBRARY_PATH is handled differently to avoid security problems. While the Solaris runtime linker applies more scrutiny to LD_LIBRARY_PATH in this case, on Linux glibc sets LD_LIBRARY_PATH to the empty string. Since the empty string will not contain the expected JDK system directories, the prefix-checking logic detected this case to avoid an infinite exec loop (4745674). Running java suid or sgid isn't necessarily recommended, but it is possible. To actually resolve linking dependencies for such binaries, OS-specific configuration may be needed to add JDK directories to the set of trusted paths.

[3] Before my batch of launcher fixes in JDK 1.4.2, the number of execs was even more varied. Specifying a different data model would exec twice, once to change the data model and again to set the LD_LIBRARY_PATH for that data model. From JDK 1.4.2 until the purge of LD_LIBRARY_PATH, the launcher used a single exec to set the LD_LIBRARY_PATH to the target data model (4492822).

Comments:

Since Linux distributions routinely re-layout SUN-built JVM files because SUN never bothered to publish a JVM that conformed to accepted Linux standards (ie the FHS which is part of the LSB), there is a high chance this will break all over the place.

The correct way to eliminate LD_LIBRARY_PATH tricks is not to use $ORIGIN trickery but to version JVM libs properly so they can work from any place in the system default LD_LIBRARY_PATH.

Posted by Nicolas Mailhot on February 01, 2010 at 06:15 PM PST #

Does this affect Windows based versions?

I am unaware of any similar LD_LIBRARY_PATH in Windows. I can only think of system libraries found in the Windows system directories, libraries found within the executable directory itself, or maybe some library path specified in a configuration of some way (Registry, .ini, custom config file, etc).

Posted by Eric on February 01, 2010 at 10:11 PM PST #

@Eric,

No, there is no analogous launcher issue with linking on Windows.

Posted by Joe Darcy on February 02, 2010 at 01:03 AM PST #

Hello Joe,

Is there any chance it has some positive impact on http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6919633 (set posix capabilities result in a loop) ?

Rgs,
JB

Posted by Jean-Baptiste Bugeaud on March 22, 2010 at 12:16 AM PDT #

@JB,

I'm not aware offhand if the change in LD_LIBRARY_PATH handling would help the bug you're seeing, but I encourage you to download a current JDK 7 build and give it a try to see if it does.

Posted by Joseph Darcy on March 22, 2010 at 05:25 AM PDT #

Well, the result is a little different as with latest JDK7 I get :

./java: relocation error: ./java: symbol JLI_Launch, version SUNWprivate_1.1 not defined in file libjli.so with link time reference

And even if I put a /etc/ld.so.conf.d/java.conf before a ldconfig...

I put a right like :
sudo setcap 'cap_net_bind_service=+epi' ./java
I set the whole VM to a user glassfish:admin and made the try with :
sudo -u glassfish ./java

This would be a real improvement for JavaEE appserver users if we find a way to make it work on JDK7.

Posted by bjb on March 22, 2010 at 09:33 AM PDT #

@JB: Were you able to resolve the capability issue yet? Searching for a solution, too :-/

Posted by foo on April 13, 2010 at 03:57 AM PDT #

Hi @Joe

I just got a workaround for http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6919633 working with the 1.7.0-ea-fastdebug-b88 :)

Still need some help (maybe at the ld team) to figure out why is the token expansion failing in linux when capabilities are set.

Rgs,
JB

Posted by bjb on April 15, 2010 at 02:09 AM PDT #

Hi Joe,

nice post, even nicer change!

This will also make debugging easier (no more "Wizard" knowledge needed about how to correctly set the LD_LIBRARY_PATH to prevent execvs)

Regards,
Volker

Posted by Volker Simonis on April 16, 2010 at 01:49 AM PDT #
