Introducing elfedit: A Tool For Modifying Existing ELF Objects

Back in June, I wrote about changes we've recently made to Solaris ELF objects that allow their runpaths to be modified without having to rebuild the object. In that posting, I alluded to work that I was then doing when I said "Eventually, Solaris will ship with a standard utility for modifying runpaths". I am happy to say that this has come to pass. I recently integrated /usr/bin/elfedit into build 75 of Solaris Nevada with:
PSARC 2007/509 elfedit
6234471 need a way to edit ELF objects

elfedit can indeed modify the runpath in an object, but it is considerably more general than that. elfedit is a tool for examining and modifying the ELF metadata that resides within ELF objects. It can be used as a batch mode tool from shell scripts, makefiles, etc., or as an interactive tool for examining and exploring objects. elfedit has a modular design, and ships with a set of standard modules for performing common edits. This design makes it easy to add new functionality by adding additional modules.

Prior to elfedit, making these sorts of modifications required the user to write a program, usually in C using libelf. elfedit significantly raises the level of abstraction at which this work is done. Many operations can be done using existing elfedit commands. For those that cannot, it is far easier to write an elfedit module to add the ability than it is to write a standalone program.

We envision elfedit being used to solve the following sorts of problems:

[Small Fixups]
To correct minor issues in a built file that cannot be easily rebuilt, or for which sources are not available.

Probably the most notable such item is the ability to alter the runpath of objects built following the integration of

PSARC 2007/127 Reserved space for editing ELF dynamic sections
6516118 Reserved space needed in ELF dynamic section and string table

The ability to do this is a "Frequently Asked Question" for which there has previously been no good answer. This feature is expected to be used nearly as soon as it is available, to fix the runpaths of FOSS (free and open source software) built for Solaris, which often has the wrong runpaths set.

Another common situation is when programmers forget to explicitly add the libraries they depend on to the link line, relying on indirect dependencies to make things work. elfedit can be used to add NEEDED dependencies to an existing object's dynamic section, making the dependencies explicit.
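To illustrate why reserved space matters for this kind of edit, here is a minimal Python sketch of adding a NEEDED entry in place. It is a simplified model, not elfedit's implementation: it assumes a 64-bit little endian dynamic array of (d_tag, d_val) pairs with spare DT_NULL slots after the terminator, and a string table with unused bytes at its end. The names add_needed and strtab_used are hypothetical.

```python
import struct

DT_NULL = 0     # terminates the dynamic array
DT_NEEDED = 1   # d_val is a string table offset naming a dependency

# Elf64_Dyn on a little endian target: d_tag (signed), d_val (unsigned).
DYN = struct.Struct("<qQ")


def add_needed(dynamic: bytearray, strtab: bytearray,
               strtab_used: int, name: str) -> int:
    """Add a DT_NEEDED entry in place, using only reserved space.

    Hypothetical helper. Returns the new count of used string table bytes.
    """
    encoded = name.encode("ascii") + b"\0"
    if strtab_used + len(encoded) > len(strtab):
        raise ValueError("not enough reserved string table space")

    # Find the terminating DT_NULL entry.
    count = len(dynamic) // DYN.size
    for i in range(count):
        tag, _ = DYN.unpack_from(dynamic, i * DYN.size)
        if tag == DT_NULL:
            break
    else:
        raise ValueError("dynamic section has no DT_NULL terminator")

    # A spare DT_NULL slot must follow, so that the array is still
    # terminated once the first DT_NULL is overwritten.
    if i + 1 >= count or DYN.unpack_from(dynamic, (i + 1) * DYN.size)[0] != DT_NULL:
        raise ValueError("no spare dynamic slot reserved")

    # Append the name to the string table, then point the new entry at it.
    strtab[strtab_used:strtab_used + len(encoded)] = encoded
    DYN.pack_into(dynamic, i * DYN.size, DT_NEEDED, strtab_used)
    return strtab_used + len(encoded)
```

Neither buffer changes size in this model, which is the point of reserving space at link time: nothing else in the file has to move. An object without spare slots or string space fails the checks above and cannot be edited this way.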

[Better Way To Support Specialized Rarely Used Features]
As an avenue for delivering small features to change some object attributes without the need to add additional complex and specialized features to ld and ld.so.1.

For example, we have had requests to add a mechanism to ld that would allow the user to override the hardware capability bits that are placed in the object by the compiler. Such a feature would be complex to document, and would burden already complex commands with features that are rarely used. Such features are a natural fit for elfedit. (See the elfedit(1) manpage for an example of modifying the hardware capabilities.)

[Linker Development]
We sometimes work on linker features that require objects with new values or flag bits that the compilers do not yet generate. elfedit allows us to set arbitrary values for such items quickly, and without having to write a program.

[Linker Testing]
Many bugs involve an object that is broken in some way. Once the bug is fixed, we need an object broken in that particular way for our test suite. There are several problems that arise:

  • Cataloging and archiving broken objects is time consuming and error prone.

  • Producing similarly broken objects for different platforms is not always possible.

  • As new platforms appear, we end up with coverage gaps where some platforms can do a given test and others cannot.

elfedit gives us the ability to build a simple object, and then break it intentionally in a specific and controlled manner. Tests can then be self contained, requiring no external data, and applicable to all relevant platforms.

elfedit's ability to extract specific bits of data from an object is very useful for object and linker testing.

Every elfedit module contains documentation for the commands it provides. This information is displayed using the built-in help command, in a format that is based on that of Solaris manpages. The help strings in the standard elfedit modules supplied with Solaris are internationalized using the same i18n mechanisms employed by the rest of the linker software found under usr/src/cmd/sgs. Hence, all elfedit modules supplied by Sun will have complete documentation, and will support the necessary language locales.

As with any program that changes the contents of an ELF file, changes made to an object by elfedit will invalidate any pre-existing elfsign signature. Assuming the changes are understood and acceptable to the signing authority, such objects will need to be re-signed after the edits are done.

Modular Design And Extensibility

elfedit has a modular design, reflecting our own experience with dynamic linking, and influenced heavily by the design of mdb, the modular debugger.

The elfedit program contains the code that handles the details of reading objects, executing commands to modify them, and saving the results. Very little of the code that performs the actual edits is found in elfedit itself. Rather, the commands exist in modules, which are sharable objects with a well defined elfedit-specific interface. elfedit loads needed modules on demand when a command from the module is executed. These modules are self contained, and include their own documentation in a standard format that elfedit can display using its help command.

The module forms a namespace for the commands that it supplies. Each module delivers a set of commands, focused on related functionality. A command is specified by combining the module and command names with a colon (:) delimiter, with no intervening whitespace. For example, dyn:runpath refers to the runpath command provided by the dyn module.

Module names must be unique. The command names within a given module are unique within that module, but the same command names can be used in more than one module. For example, most modules contain a command named 'dump', which is used to provide an elfdump-style view of the data.
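The namespace and on-demand loading described above can be sketched in a few lines of Python. This is only an illustration of the idea, not elfedit's actual module interface: the Registry class, the AVAILABLE table, and the strings the commands return are all hypothetical, and the sketch rejects names that lack a module prefix rather than guessing how elfedit treats them.

```python
from typing import Callable, Dict, Tuple

Command = Callable[[], str]

# Hypothetical stand-ins for loadable modules. Each factory plays the
# role of loading a module and returning the commands it supplies.
AVAILABLE: Dict[str, Callable[[], Dict[str, Command]]] = {
    "dyn": lambda: {"dump": lambda: "dynamic section",
                    "runpath": lambda: "runpath"},
    "ehdr": lambda: {"dump": lambda: "ELF header",
                     "e_flags": lambda: "e_flags"},
}


def split_command(name: str) -> Tuple[str, str]:
    """Split 'module:command'; no whitespace is allowed around the colon."""
    module, sep, command = name.partition(":")
    if not sep or not module or not command or any(c.isspace() for c in name):
        raise ValueError(f"expected module:command, got {name!r}")
    return module, command


class Registry:
    """Loads a module the first time one of its commands is requested."""

    def __init__(self, available: Dict[str, Callable[[], Dict[str, Command]]]):
        self._available = available
        self.loaded: Dict[str, Dict[str, Command]] = {}

    def lookup(self, name: str) -> Command:
        module, command = split_command(name)
        if module not in self.loaded:
            if module not in self._available:
                raise KeyError(f"no such module: {module!r}")
            self.loaded[module] = self._available[module]()
        try:
            return self.loaded[module][command]
        except KeyError:
            raise KeyError(f"no command {command!r} in module {module!r}") from None
```

Because the module name is part of every lookup, dyn:dump and ehdr:dump resolve to different commands even though both are named dump, and a module that is never referenced is never loaded.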

We have adopted the following general rules of thumb for naming modules and commands:

  • The module name reflects the part of the ELF format the module addresses (ehdr, phdr, shdr, ...)

  • Commands that directly access a field in an ELF structure are given the name of the field (e.g. ehdr:e_flags).

  • Commands that are higher level have a simple descriptive name that reflects their purpose (e.g. dyn:runpath).

Give 'Em Enough Rope

elfedit is a tool for linker development and testing. As such, it follows the Unix tradition of doing what it's told, without a lot of noise. This is great if you are doing linker research & development, or testing. We commonly need to intentionally set ELF metadata to undefined or even "wrong" values. However, it follows that elfedit won't prevent you from making nonsensical or otherwise incorrect changes to your ELF objects.

For example, X86 objects have little endian byte order (ELFDATA2LSB):

% file /usr/bin/ls
/usr/bin/ls:    ELF 32-bit LSB executable 80386 Version 1 [FPU],
    dynamically linked, not stripped, no debugging information available

We can change the e_ident[EI_DATA] field in the ELF header from its proper value to ELFDATA2MSB, which reverses the byte order advertised by the program and makes it appear to be big endian:

% elfedit -e 'ehdr:ei_data elfdata2MSB' /usr/bin/ls /tmp/badls
% file /tmp/badls
/tmp/badls:     ELF 32-bit MSB executable 80386 Version 1 [FPU],
    dynamically linked, not stripped, no debugging information available

The file command sees the change that we made. However, we haven't really created a big endian X86 binary by changing what it advertises. We now have a little endian binary that is lying about what it contains. And of course, there is no such thing as big endian X86 hardware, so even if we had created such a binary, it wouldn't be runnable anywhere. It should come as no surprise that the system doesn't know what to do with our modified ls binary:

% /tmp/badls
/tmp/badls: cannot execute

This is really nothing to be worried about. If you are using elfedit's low level operations that allow arbitrary changes to individual ELF fields, then you need to know enough about the ELF format to make these changes properly. Most people will use elfedit for the high level operations such as changing runpaths. The high level operations are safe, and do not require expert knowledge to use.

If you are making those low level changes, the Solaris Linkers and Libraries Guide can be very helpful.
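To make the byte order example concrete, the following Python sketch applies the same one-byte change to an in-memory ELF header. It is an illustration under stated assumptions rather than a description of how elfedit works internally: the function names are hypothetical, while the offsets and constants (EI_DATA at index 5 of e_ident, ELFDATA2LSB = 1, ELFDATA2MSB = 2, e_machine at byte offset 18) come from the ELF format itself.

```python
import struct

ELFMAG = b"\x7fELF"   # magic number at the start of e_ident
EI_DATA = 5           # index of the data encoding byte within e_ident
ELFDATA2LSB = 1       # little endian
ELFDATA2MSB = 2       # big endian
E_MACHINE_OFFSET = 18  # e_machine follows e_ident (16 bytes) and e_type (2)


def advertised_byte_order(image: bytes) -> str:
    """Report the byte order the header claims, as file(1) would label it."""
    if image[:4] != ELFMAG:
        raise ValueError("not an ELF image")
    return {ELFDATA2LSB: "LSB", ELFDATA2MSB: "MSB"}.get(image[EI_DATA], "invalid")


def set_ei_data(image: bytes, value: int) -> bytes:
    """Return a copy of the image with only e_ident[EI_DATA] changed."""
    edited = bytearray(image)
    edited[EI_DATA] = value
    return bytes(edited)


def read_e_machine(image: bytes) -> int:
    """Decode e_machine the way a consumer that trusts EI_DATA would."""
    fmt = "<H" if image[EI_DATA] == ELFDATA2LSB else ">H"
    return struct.unpack_from(fmt, image, E_MACHINE_OFFSET)[0]
```

Only one byte differs between the two images, so every multi-byte field is still stored little endian. A consumer that trusts the new EI_DATA value decodes those fields with the wrong byte order: an e_machine of 3 (EM_386) reads back as 0x0300. That is the sense in which the edited binary is lying about what it contains.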

Learning More

elfedit is a standard part of the Solaris development branch, the code that will eventually ship from Sun as the next version of Solaris. It is also available as part of OpenSolaris. It is not part of Solaris 10 or earlier Solaris releases. If you are using a recent Solaris distribution, such as Solaris Express Developer Edition, then elfedit should already be present on your system.

The elfedit(1) manpage describes the utility in more detail, and gives three examples that should be of general interest:

  1. Changing runpaths
  2. Changing hardware/software capability bits
  3. Reading specific data, without having to grep the output of elfdump.


About

I work in the core Solaris OS group on the Solaris linkers. These blogs discuss various aspects of linking and the ELF object file format.
