Making Code More Secure with GCC – Part 2

This blog entry was contributed by Maxim Kartashev

In the previous post I focused on the static analysis capabilities of the gcc 7.3 compiler. Warnings issued at compile time can point to the place in a program where an error at run time might occur, thus enabling the programmer to fix the program even before it is run. Not all run time errors can be predicted at compile time, though, and there are good and bad reasons why. For instance, there might be many annoying false positive warnings that get routinely ignored (and sometimes rightly so), until that time when one of them points to the actual problem, but gets silenced together with the rest. Or the programmer invokes undefined behavior, which in many cases is impossible to diagnose at compile time because there are simply no provisions for that in the programming language.

The GNU toolchain continues to help the programmer even past compile time with the help of code instrumentation and additional features baked into the glibc library. In this post I am going to describe the necessary steps to utilize these capabilities.

Flaws that make a program work incorrectly even on correct data are only part of the problem: an attacker will also attempt to create input unforeseen by the programmer in order to take control of the program. Here again, gcc can help to strengthen the code it generates by structuring it differently and providing additional checks. This post lists several of the most useful techniques that gcc 7.3 implements.

Finding Bugs At Run Time

Some compiler warnings can be legitimately – from the point of view of the language – suppressed. One example is shown below: an explicit type cast spelled out in the code makes the compiler believe that you know what you are doing and not complain.

a.c

int global;
int main()
{
    int* p = &global;
    long* lp = p;  // warning: initialization from incompatible
                   // pointer type [-Wincompatible-pointer-types]

    long l1 = *lp;
    long l2 = *(long*)p; // same conversion as above, but no warning
}

$ gcc -fsanitize=undefined a.c
a.c: In function ‘main’:
a.c:5:16: warning: initialization from incompatible pointer type
          [-Wincompatible-pointer-types]

     long* lp = p;
                ^

These kinds of tricks put the program into undefined behavior territory, meaning that it is no longer predictable what the program will do. It is often tempting to dismiss the severity of undefined behavior; in fact, not many situations really lead to unpredictable results at low optimization levels. The danger grows considerably at high -O settings because undefined behavior starts to break the compiler’s understanding of the program and, guessing incorrectly, the compiler can generate code that does peculiar things. As an example, see how undefined behavior can erase your hard disk.

Fortunately, the gcc compiler can still help to find at least some kinds of undefined behavior situations. It can be asked to instrument the generated code with additional instructions that would perform various checks before actual user code gets executed. To enable this instrumentation, use the -fsanitize=undefined option when compiling and linking your program. When executed, the program will report problems spotted as “runtime errors”. See, for instance, how the GNU toolchain detects two bugs in the above code at run time:

  
$ ./a.out
a.c:9:10: runtime error: load of misaligned address 0x0000006010dc for 
          type 'long int', which requires 8 byte alignment

0x0000006010dc: note: pointer points here
  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^

a.c:9:10: runtime error: load of address 0x0000006010dc with insufficient 
          space for an object of type 'int'

0x0000006010dc: note: pointer points here
  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

The -fsanitize option has many sub-options. If you are interested in finding out which specific situation can be detected by the version of the GNU toolchain you are using, check the Program Instrumentation Options section of its documentation.
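One of the checks in the -fsanitize=undefined group is signed-integer-overflow. Once the sanitizer has pointed at such an overflow, the code needs a defined way to handle it. The following is a minimal sketch of one option, not the only possible fix: the helper name checked_add is made up for illustration, and it relies on the __builtin_add_overflow built-in, which is a GCC extension (GCC 5 and later), not standard C.

```c
#include <assert.h>
#include <limits.h>   /* INT_MAX, used in the usage note below */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: adds a and b without invoking undefined behavior.
   Returns true and stores the sum in *sum if it fits in an int;
   returns false and leaves *sum untouched if the addition would overflow. */
bool checked_add(int a, int b, int *sum)
{
    int result;

    assert(sum != NULL);
    /* The built-in returns true when the mathematically exact result
       does not fit in the type of result. */
    if (__builtin_add_overflow(a, b, &result))
        return false;
    *sum = result;
    return true;
}
```

With this helper, checked_add(INT_MAX, 1, &s) returns false instead of overflowing, so the caller decides what to do rather than leaving it to the optimizer.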

By default, most -fsanitize=undefined checks report the error and let the program continue, which is why two errors were reported in the run above; this behavior corresponds to the -fsanitize-recover=undefined compiler option. Remember, though, that errors can cascade and all but the first one may not be very useful. To stop at the first error instead, which makes that first failure easier to debug, compile with -fno-sanitize-recover=undefined.
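Both run-time errors in the a.c example come from reading memory through a pointer of the wrong type. When the goal is to read a value of one type out of raw bytes, copying the bytes with memcpy(3) is well defined regardless of alignment. The sketch below illustrates the idea; the function name load_long and its error convention are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reads a long stored at byte offset off in buf.
   Returns 0 and stores the value in *out on success, or -1 if fewer
   than sizeof(long) bytes are available at that offset. */
int load_long(const unsigned char *buf, size_t len, size_t off, long *out)
{
    assert(buf != NULL && out != NULL);

    /* Refuse to read past the end of the buffer. */
    if (off > len || len - off < sizeof *out)
        return -1;

    /* memcpy has no alignment requirement on its source, so this is
       defined even when buf + off is not suitably aligned for long. */
    memcpy(out, buf + off, sizeof *out);
    return 0;
}
```

Compilers typically turn such a fixed-size memcpy into a plain load where the target allows it, so the defined version usually costs nothing extra.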

Memory Corruption Mitigation

Memory corruption is perhaps the most common source of subtle bugs and vulnerabilities. Unsurprisingly, many tools exist to help the programmer find the origin of the problem (memcheck, discover, etc.). The GNU toolchain has not one but two such technologies: run-time program instrumentation (“AddressSanitizer”) and, independent of it, built-in checks in the glibc dynamic memory allocator.

AddressSanitizer

The gcc compiler can instrument memory access instructions so that out-of-bounds and use-after-free bugs can be detected. This method requires recompilation with the -fsanitize=address option and produces code that runs slower than without instrumentation (expect roughly a 2x slowdown). When compiling with optimization, the -fno-omit-frame-pointer option is recommended because the sanitizer runtime uses a fast and simple frame-based stack unwinder, which needs the frame pointer register to serve its primary function. At run time, a detailed error message is written to stderr, complete with stack traces for the invalid access and for the allocation of the memory block (if it was on the heap). Many find it helpful not to abort on the first error; the -fsanitize-recover=address option enables this when the program is also run with ASAN_OPTIONS=halt_on_error=0.

Here’s an example of the sanitizer output from this code:

a.c

// ...
char* p = malloc(2);
p[2] = 0; // writes past the allocated buffer
// ...

$ gcc -fsanitize=address a.c
$ ./a.out
=================================================================
==27056==ERROR: AddressSanitizer: heap-buffer-overflow on 
         address 0x619000000480 at pc 0x000000400726 bp 0x7fffffffd910 

WRITE of size 1 at 0x619000000480 thread T0
    #0 0x400725 in main (/tmp/a.out+0x400725)
    #1 0x7ffff6a7d3d4 in __libc_start_main (/lib64/libc.so.6+0x223d4)
    #2 0x400618  (/tmp/a.out+0x400618)

0x619000000480 is located 0 bytes to the right of 1024-byte 
region [0x619000000080,0x619000000480)
allocated by thread T0 here:
    #0 0x7ffff6f01900 in __interceptor_malloc /.../asan_malloc_linux.cc:62
    #1 0x4006d8 in main (/tmp/a.out+0x4006d8)
    #2 0x7ffff6a7d3d4 in __libc_start_main (/lib64/libc.so.6+0x223d4)

SUMMARY: AddressSanitizer: heap-buffer-overflow (/tmp/a.out+0x400725) in main
Shadow bytes around the buggy address:
...
  0x0c327fff8080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c327fff8090:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
...

This method works not only on dynamically allocated memory, but on stack (local, automatic) variables and global statically allocated data. Many aspects of the address sanitizer’s work are controlled with the ASAN_OPTIONS environment variable, including what to check and what to report. For example, specifying ASAN_OPTIONS=log_path=memerr.log will redirect all output to the file named memerr.log.<pid> instead of stderr. See the complete option reference.
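The heap-buffer-overflow in the snippet above is a classic off-by-one: index 2 is written in a 2-byte block. A possible shape of the fix is sketched below, assuming the intent was to keep a NUL-terminated copy of a short string; the function name copy_prefix is invented for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated, NUL-terminated copy of
   at most the first n bytes of src, or NULL if allocation fails.
   The caller owns the result and must free() it. */
char *copy_prefix(const char *src, size_t n)
{
    size_t len;
    char *p;

    assert(src != NULL);

    len = strlen(src);
    if (n > len)
        n = len;              /* never read past the end of src */

    p = malloc(n + 1);        /* one extra byte for the terminator */
    if (p == NULL)
        return NULL;

    memcpy(p, src, n);
    p[n] = '\0';              /* index n is valid: n + 1 bytes were allocated */
    return p;
}
```

Built with -fsanitize=address, a call such as copy_prefix("a.out", 2) should run without a report, because every access stays inside the allocated block.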

Dynamic Memory Checks by glibc

The glibc dynamic memory allocator can, if requested, perform heap consistency checks and report problems to stderr together with the stack trace and memory map at the time of the error. To utilize this capability, set the MALLOC_CHECK_ environment variable prior to running the program (its values control what to do on error and can be found in mallopt(3)). You can also insert explicit heap checks by either linking with the -lmcheck option or calling the mcheck(3) function before the first call to malloc(3). All the specifics can be found in the mcheck(3) man page. This is an example of this facility:

a.c

char* p = malloc(n);
// ...
if ( argc == 1 ) {
    free(p);
}
// ...
free(p);

No additional compilation options are required:

$ gcc a.c

$ MALLOC_CHECK_=3 ./a.out 
*** Error in `./a.out': free(): invalid pointer: 0x0000000000602010 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x8362e)[0x7ffff7a9162e]
./a.out[0x40058b]
...
Aborted (core dumped)

The types of problems found by these checks are limited to heap metadata corruption (heap buffer overruns) and things like double free. Still, the method requires neither changes to the code nor recompilation, has a lower performance impact than the AddressSanitizer described above, and can be used to abort the program to ease debugging, all of which make it a useful tool in keeping your program free of dynamic memory corruption.
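Once MALLOC_CHECK_ has exposed a double free like the one in the example, a common defensive idiom is to reset the pointer right after freeing it, since free(NULL) is defined to do nothing. The sketch below shows the idiom; the helper name free_and_clear is hypothetical, and it only helps when every free goes through the same pointer variable (other copies of the pointer remain dangling).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees the block *pp points to and resets *pp to
   NULL, so that calling it again on the same variable is harmless. */
void free_and_clear(char **pp)
{
    assert(pp != NULL);
    free(*pp);      /* free(NULL) is a no-op, so a repeated call is safe */
    *pp = NULL;
}
```

In the example above, replacing both free(p) calls with free_and_clear(&p) would make the second call a no-op when argc == 1.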

Options to Increase Code Security

The GNU compiler implements several techniques to harden the program against possible attacks. They work by inserting small bits of code and/or by adding checks to some standard functions (strcat(3), for instance) that verify the integrity of the vital data at run time and abort the program if the data get damaged, which may be the result of a programming error or attempted attack. All these options are aimed at being enabled for production builds.

The -fstack-protector option adds protection against stack smashing attacks by placing a few guard bytes on the stack of a vulnerable function (see below) and verifying, before the function returns, that those bytes haven’t been changed. If they have, an error is printed and the program aborts:

*** stack smashing detected ***: ./a.out terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x7ffff7b26677]
/lib64/libc.so.6(+0x118632)[0x7ffff7b26632]
./a.out[0x400589]
./a.out[0x400599]
...

By default, only functions that call alloca(3) and functions with buffers larger than 8 bytes are protected by this option. There are several choices as to which functions to consider vulnerable and protect:

-fstack-protector-strong also protects functions that have local array definitions or references to local frame addresses,
-fstack-protector-all protects all functions, and
-fstack-protector-explicit only protects those with the stack_protect attribute, which you need to add manually.
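For the -fstack-protector-explicit variant, the attribute is attached to individual function definitions. The following sketch shows what that might look like; the function store_tag and its buffer sizes are invented for this example, and the stack_protect attribute is GCC-specific, so other compilers may ignore it with a warning.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical function marked for protection: when compiled with
   -fstack-protector-explicit, gcc adds the guard-bytes check only to
   functions that carry the stack_protect attribute, such as this one. */
__attribute__((stack_protect))
size_t store_tag(const char *tag, char *out, size_t out_size)
{
    char local[8];   /* small stack buffer, below the default threshold */
    size_t n;

    assert(tag != NULL && out != NULL && out_size > 0);

    /* Truncate to what fits in the local buffer, leaving room for NUL. */
    n = strlen(tag);
    if (n >= sizeof local)
        n = sizeof local - 1;
    memcpy(local, tag, n);
    local[n] = '\0';

    /* Truncate again to what fits in the caller's buffer. */
    if (n >= out_size)
        n = out_size - 1;
    memcpy(out, local, n);
    out[n] = '\0';
    return n;        /* number of characters stored, excluding the NUL */
}
```

The attribute does not replace bounds checking: the guard bytes only detect that the stack was already damaged, which is why the function still truncates its input.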

Another compiler option that helps to protect against stack-tampering attacks is -fstack-check. When a single-threaded program goes beyond its stack boundaries, the OS generates a signal (typically SIGSEGV) that terminates the program. In a multi-threaded program, which has multiple stacks, such a situation is harder to detect because the bottom of one thread’s stack might be the top of another’s, and the gap between them (protected by the OS) is small enough to be “jumped” over. The -fstack-check option helps to mitigate that: it makes sure the OS knows when a stack is being extended and by how many pages, even if the attacker arranges for the program not to touch every page of the newly extended stack. As a result, the OS-guarded region between different threads’ stacks is guaranteed to get touched, and the multi-threaded program receives the same neat terminating signal as an offending single-threaded one.

The next code hardening technique is activated by defining the _FORTIFY_SOURCE macro to 1 (checks that do not change semantics) or 2 (more checking, but conforming programs might fail). It provides protection against silent buffer overruns by functions that manipulate strings or memory, such as memset(3) or strcpy(3). Precise information for your version of the toolchain can be found in the feature_test_macros(7) man page.

As I have mentioned in my previous post, many compiler checks benefit from an increased level of optimization that allows gcc to collect more data about the program. The use of _FORTIFY_SOURCE macro requires the optimization level of -O1 or above. Potential errors are detected both at run and compile time when possible.

Consider this example:

a.c

	
#include <string.h>
 
int main(int argc, char* argv[])
{
    char s[2];
    strcpy(s, "a.out"); // buffer overrun here
    return 0;
}

Compiling it with the usual flags doesn’t spot any problems, even though obviously the "a.out" string doesn’t fit into the two bytes available in the local variable s:

$ gcc -O2 -Wall -Wextra -Wno-unused a.c

Even running the program gives no hint of possible trouble:

$ ./a.out
$ echo $?
0

Let’s add the _FORTIFY_SOURCE macro:

$ gcc -D_FORTIFY_SOURCE=1 -O2 -Wall -Wextra -Wno-unused a.c
In file included from /usr/include/string.h:638:0,
                 from a.c:1:
In function ‘strcpy’,
    inlined from ‘main’ at a.c:6:5:
/usr/include/bits/string3.h:104:10: warning: ‘__builtin___strcpy_chk’ 
    writing 6 bytes into a region of size 2 overflows the destination 
    [-Wstringop-overflow=]

   return __builtin___strcpy_chk (__dest, __src, __bos (__dest));
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

And we immediately get a warning from the compiler.

Now let’s see what happens if the source string being copied is not a compile-time constant:

a.c

#include <string.h>
 
int main(int argc, char* argv[])
{
    char s[2];
    strcpy(s, argv[0]); // buffer overrun here as argv[0] will be "./a.out"
    return 0;
}

Notice that this time there are no warnings:

$ gcc -D_FORTIFY_SOURCE=1  -O2 -Wall -Wextra -Wno-unused a.c

At run time, however, the support library detects the buffer overrun and immediately aborts execution of the program:

$ ./a.out 
*** buffer overflow detected ***: ./a.out terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x7ffff7b26677]
/lib64/libc.so.6(+0x1167f2)[0x7ffff7b247f2]
./a.out[0x40052d]
...

Aborted (core dumped)
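_FORTIFY_SOURCE turns the overrun into an abort, but the code still has to be fixed. One possible repair, sketched here under the assumption that truncation is acceptable as long as it is reported, is to pass the destination size explicitly and check the result. The helper name copy_checked and its return convention are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dst, which holds dst_size bytes.
   The result is always NUL-terminated. Returns 0 if the whole string
   fit and -1 if it had to be truncated (or snprintf failed). */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    int n;

    assert(dst != NULL && src != NULL && dst_size > 0);

    /* snprintf never writes more than dst_size bytes and returns the
       length the full string would have had. */
    n = snprintf(dst, dst_size, "%s", src);
    return (n >= 0 && (size_t)n < dst_size) ? 0 : -1;
}
```

In the example above, replacing strcpy(s, argv[0]) with copy_checked(s, sizeof s, argv[0]) keeps the write inside s and returns -1, so the caller can report the truncation instead of corrupting the stack.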

Key Take-Aways

The GNU toolchain can be utilized to find bugs and vulnerabilities at run time:

Compile your program with the -fsanitize=undefined gcc option and run your tests. This will exercise a lot of additional checks that will help to ensure the program actually behaves as intended and doesn’t do so simply by accident.
Both suspected and unsuspected problems with the heap can often be detected by setting the MALLOC_CHECK_ environment variable prior to running your program (see mallopt(3) for more info). No re-compilation required!
If recompiling is possible, all kinds of memory access problems can be detected by AddressSanitizer: compile with -fsanitize=address (adding -O -fno-omit-frame-pointer to reduce the negative performance impact and, possibly, -fsanitize-recover=address to not abort on first error).

The GNU toolchain can also harden your program against certain kinds of attacks:

The -fstack-protector option adds stack integrity checks to certain vulnerable functions. You can control which functions to protect with sub-options.
Use -fstack-check for multi-threaded programs to prevent one thread from silently extending its stack on top of another.
Add -D_FORTIFY_SOURCE=1 -O2 to your compilation flags to catch buffer overruns by certain standard memory manipulation functions both at run and compile time. See feature_test_macros(7) for more info.

References

List of gcc options for program instrumentation (-fsanitize= and friends).
The complete list of gcc options with descriptions.
glibc built-in heap consistency checks.

Authors

Elena Zannoni
