Making Code More Secure with GCC – Part 1

This blog entry was contributed by Maxim Kartashev

In today’s world, programming securely is the default option for any project. A program that doesn’t validate its input, contains buffer overruns or uninitialized variables, uses obsolete interfaces, etc. quickly becomes a liability.

Standards, best practices, and tools that help find security-related bugs and prevent them from creeping into code are in no short supply. There are, for instance, the SEI CERT secure coding standards, designed to be statically verifiable, and a multitude of static checkers such as Fortify, Coverity, and Parfait, to name a few. They all help to make code more secure, each at its own cost to developers, but they inevitably involve tools that are generally foreign to the development process. The effort required to start using that software in your project varies and is never zero.

Authors of Coverity, a popular static program analyzer, formulated two laws of bug finding:

“Law #1: You can’t check code you don’t see. Law #2: You can’t check code you can’t parse”.

From the tool developer’s perspective, this means that the code analyzer must mimic the toolchain (compiler, linker, support libraries) used to build the program as closely as possible. Otherwise, it risks missing bugs or seeing what’s not there, which results in the false positives that so frequently throw us all off. On the other hand, the toolchain itself has many qualities of a program checker: the compiler can flag potential errors in the code, often at no additional cost to the user; the linker can help to find inconsistencies in inter-module calls and warn about the use of insecure and outdated interfaces; and the run-time support libraries can do additional bookkeeping and help to locate accidental interface misuse.

This post starts a short series, in which I am going to explore the capabilities of the GNU 7.3 toolchain in the area of secure programming. I’ll focus on the power of the compiler as a static analyzer in this post.

GCC Static Analysis Options

Before generating executable code, the compiler has to perform a great many checks to make sure the program conforms to the syntactic and semantic constraints of the language it is written in. For example, list-initialization (with braces) of a 1-byte variable with an integer constant that doesn’t fit in it is an error in C++11 and, quite naturally, the error will be reported:

char c{0xFFFF}; // error: narrowing conversion of `65535` from `int` to `char` inside { } [-Wnarrowing]

Some of those checks aren’t strictly required, but help to bring potential problems to the programmer’s attention. For example, using a different kind of initialization of the same variable with the same value is technically allowed, but still has the same chance of being an error on the programmer’s part. This is where compiler warnings come in:

char c(0xFFFF); // warning: overflow in implicit constant conversion [-Woverflow]
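The compiler can only warn like this when the value is a constant it can see. For values that arrive at run time, an explicit range check keeps the narrowing visible in the code. Here is a minimal sketch in C; the helper name `to_uchar_checked` is hypothetical and not part of any library:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores value in *out only if it fits in an
   unsigned char; returns false and leaves *out untouched otherwise. */
static bool to_uchar_checked(int value, unsigned char *out)
{
    if (value < 0 || value > UCHAR_MAX) {
        return false;   /* a plain assignment would silently truncate here */
    }
    *out = (unsigned char)value;   /* explicit cast: the narrowing is intentional */
    return true;
}
```

A caller can then decide what to do with an out-of-range value instead of continuing with a truncated one.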

gcc 7.3 has almost 150 distinct options that control warnings. Some are useful because they indicate unintended user errors, even if the language rules say nothing about that situation. Some are there to help enforce certain guidelines that may or may not be employed by your project (for instance, -Weffc++). Fortunately, very few of those options need to be mentioned by name thanks to several “macro” options that enable many warnings at once. These are -Wall and -Wextra. Together, the two options control 50+ warnings, all of which are useful, so they are a sensible default for any build.

Despite the name, -Wall doesn’t turn on all the warnings; neither does -Wextra. While they give diagnostics worth paying attention to, there could be an overwhelming number of “unused variable” warnings at first. Those rarely indicate real problems in the code (but see below), so it might be a good idea to add -Wno-unused until all the other warnings have been dealt with. The proper way to silence the “unused” warnings is to add __attribute__((unused)) to those variables that are intentionally unused.
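As an illustration of the attribute, here is a small sketch in C. The callback type `handler_fn` and both functions are hypothetical names invented for this example; only `__attribute__((unused))` itself is the gcc feature being shown:

```c
#include <stddef.h>

/* Hypothetical callback type: every handler receives a context pointer,
   even if a particular handler has no use for it. */
typedef int (*handler_fn)(void *context, int value);

/* The attribute tells gcc that leaving 'context' unused is deliberate,
   so the unused-parameter warning enabled by -Wall -Wextra stays quiet. */
static int double_value(void *context __attribute__((unused)), int value)
{
    return value * 2;
}

/* Calls the handler with a null context, just to show the callback shape. */
static int run_handler(handler_fn fn, int value)
{
    return fn(NULL, value);
}
```

Marking only the parameters that are unused on purpose keeps the warning active for everything else.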

Note

Sometimes the warning about an unused variable hints at the real problem, so don’t turn them off forever. For example, in the following code, the “unused” warning indirectly points to the fact that the constructor’s parameter was used instead of the class member, which was obviously intended to be initialized in the constructor. This is the result of naming the constructor parameter the same as the class’s data member (may compilers be merciful to those adventurous souls who do such a thing).

struct A
{
    int field;
    A(int field) // warning: parameter `field` set but not used [-Wunused-but-set-parameter]
    {
        field = 42; // meant to initialize the data member, but set constructor's parameter instead
    }
};

Additional Help

The -Wall -Wextra warnings do not fully unleash the potential of gcc’s static analysis capabilities. To narrow the compiler’s focus on the program security, consider adding these options as well: -Wformat-security -Wduplicated-cond -Wfloat-equal -Wshadow -Wconversion -Wjump-misses-init -Wlogical-not-parentheses -Wnull-dereference

Here is why I consider those useful:

`-Wformat-security` or `-Wformat=2`: Help to catch all kinds of format-string-related security holes. It always makes sense to keep either option on the command line.

void foo(const char* s)
{
    printf(s); // warning: format not a string literal and no format arguments [-Wformat-security]
}

`-Wduplicated-cond`: Seems to always indicate a bug; for example, a misspelled comparison operator.

if ( p == 42 ) {
    return 1;
} else if ( 42 == p ) { // warning: duplicated `if` condition [-Wduplicated-cond]
    return 2;
}

`-Wfloat-equal`: The result of comparing floating point numbers for equality is rarely predictable, so such a comparison indicates a possible bug in the code. Consult the 1991 article titled “What Every Computer Scientist Should Know About Floating-Point Arithmetic” for an in-depth explanation of the reasons.

double d = 3;
return d == 3; // warning: comparing floating point with == or != is unsafe [-Wfloat-equal]

`-Wshadow`: This option helps to catch accidental misuse of variables from different scopes and is highly recommended.

int global = 42;

int main()
{
    char global = 'a'; // warning: declaration of 'global' shadows a global declaration [-Wshadow]
    // ... many lines later ...
    return global; // refers to the char variable, not ::global
}

`-Wconversion`: The rules of adjusting the value when changing its type are complex and sometimes counter-intuitive. This option helps to spot unintended value adjustments.

unsigned u = -1; // warning: negative integer implicitly converted
                 // to unsigned type [-Wsign-conversion]

`-Wjump-misses-init`: Unlike in C++, jumping past variable initialization is not an error in C, but is nevertheless dangerous.

switch(i) {
case 10:
    foo();
    int j = 42;
case 11: // warning: switch jumps over variable initialization [-Wjump-misses-init]
    return j;
default:
    return 42;
}

`-Wlogical-not-parentheses`: This option helps to find conditions that are questionable from the readability point of view and that may or may not indicate a bug in the code.

if ( ! a > 1 ) // warning: logical not is only applied to the left hand
               // side of comparison [-Wlogical-not-parentheses]

`-Wnon-virtual-dtor`: This option is specific to C++ and usually indicates a problem in the code, but is not sophisticated enough to guarantee the absence of false positives.

struct A // warning: `struct A` has virtual functions and non-virtual destructor
{
    virtual void foo();
    ~A();
};

void foo(A* a)
{
    delete a; // warning: deleting object of polymorphic class type `A` which has
              // non-virtual destructor might cause undefined behavior [-Wdelete-non-virtual-dtor]
}

`-Wnull-dereference`: This is a very useful warning with few to no false positives, but it requires the -fdelete-null-pointer-checks option, which is enabled by optimizations for most targets.

void foo(int* p)
{
    *p = 1; // warning: null pointer dereference [-Wnull-dereference]
}

int main()
{
    int* p = 0;
    foo(p);
}
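Two of these warnings have simple, conventional fixes. The sketch below shows one possible shape for each in C. The helper names `format_message` and `nearly_equal` are hypothetical, and an absolute tolerance is only one of several ways to compare floating point values; which one is appropriate depends on the program:

```c
#include <stdbool.h>
#include <stdio.h>

/* Addresses -Wformat-security: the untrusted string is passed as an
   argument to a constant "%s" format, never as the format itself.
   Writes into out (capacity out_size) and returns snprintf's result. */
static int format_message(char *out, size_t out_size, const char *s)
{
    return snprintf(out, out_size, "%s", s);
}

/* Avoids the == flagged by -Wfloat-equal: compares two doubles
   using a caller-chosen absolute tolerance. */
static bool nearly_equal(double a, double b, double tolerance)
{
    double diff = a - b;
    if (diff < 0) {
        diff = -diff;
    }
    return diff <= tolerance;
}
```

With the constant format, conversion specifiers inside the input string are copied as plain characters rather than interpreted.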

Higher Optimization Means Better Analysis

In order to be helpful with some of those warnings (like, for example, -Wnull-dereference and -Wstringop-overflow), the compiler needs to collect and analyze various kinds of information about the program. Some types of analysis are only performed at higher optimization levels, which is why it is advisable to compile at least with -O1 to get better diagnostics.

For example:

#include <string.h>

int main(int argc, char *argv[])
{
    char buf[4];
    const char *s = argc > 10 ? argv[0] : "adbc"; // "s" may require 5 or more bytes
 
    strcpy(buf, s); // there's only room for 3 characters and the terminating 0 byte in buf
}

With the default optimization level – implying no optimization at all – you get no warnings:

$ gcc -Wall -Wextra a.c

But with -O1, the problem gets spotted:

$ gcc -Wall -Wextra -O1 a.c

a.c: In function ‘main’:
a.c:8:5: warning: ‘strcpy’ writing 5 bytes into a region of size 4 overflows the destination [-Wstringop-overflow=]
     strcpy(buf, s);
     ^~~~~~~~~~~~~~
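One way to remove the overflow flagged here is to bound the copy by the size of the destination. The following is a minimal sketch in C, not the only possible fix; the helper name `copy_bounded` is hypothetical, and it relies on the standard guarantee that snprintf never writes more than the given number of bytes:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dst, whose capacity is dst_size
   bytes including the terminating 0 byte. Returns true if src fit
   entirely and false if it had to be truncated. When dst_size > 0,
   dst is always 0-terminated. */
static bool copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

Called as copy_bounded(buf, sizeof buf, s) in the example above, it would store "adb" plus the terminating 0 byte and report the truncation to the caller instead of writing past the end of buf.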

Inter-module Checks

Even the highest optimization level cannot compensate for lack of information: the compiler is usually given one compilation unit (CU) at a time, making cross-checks between CUs impossible. There’s a solution, though: the -flto option. It works with the linker’s help and can spot otherwise very hard-to-find bugs.

In this example, a function is defined as char foo(int) in one file, but declared int foo(int) in another:

a.c

char foo(int i)
{ /* ... */ }

b.c

extern int foo(int);
 
typedef int (*FUNC)(int);
 
int main()
{
    FUNC fp = &foo;
    int i = fp(1); // foo() actually only returns 1 byte, while we read sizeof(int) here
    return i; // may return garbage
}

Notice the difference in the size of return types; when this function is called by the CU that only sees the latter declaration, it can end up reading uninitialized memory (3 bytes more than the function actually returns). Only the final link step with -flto can help to catch this:

$ gcc -flto -c a.c b.c # no warnings

$ gcc -flto a.o b.o
b.c:3:12: warning: type of ‘foo’ does not match original declaration [-Wlto-type-mismatch]
 extern int foo(int);
            ^
a.c:1:6: note: return value type mismatch
 char foo(int i)
      ^
a.c:1:6: note: type ‘char’ should match type ‘int’
a.c:1:6: note: ‘foo’ was previously declared here

As you can see, -flto has enabled gcc to compare declaration and definition of the function and find that they aren’t really compatible.
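A common way to keep this kind of mismatch from arising in the first place is to put a single declaration in a header that both files include, so that the compiler itself checks the definition against it. The sketch below collapses that layout into one file for brevity. The header name foo.h, the wrapper `call_foo`, and the body of foo are all assumptions made for illustration, since the original elides the function body:

```c
/* --- contents of a hypothetical foo.h, included by both a.c and b.c --- */
char foo(int i);

/* --- a.c: the definition is checked against the declaration above --- */
char foo(int i)
{
    return (char)('a' + i % 26);   /* placeholder body, assumed for illustration */
}

/* --- b.c: the caller sees the same declaration and the real return type --- */
typedef char (*FUNC)(int);

static int call_foo(int arg)
{
    FUNC fp = &foo;
    return fp(arg);   /* the char result is converted to int by the language rules */
}
```

If the definition and the shared declaration disagreed, an ordinary compile of a.c would already fail with a conflicting-types error, with no need for -flto.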

Key Take-Aways

To make your gcc-compiled program more secure:

- Always add -Wall -Wextra to the gcc command line to get an ever-expanding set of useful diagnostics about your program.
  - Add -Wno-unused if the number of messages regarding unused variables is overwhelming; consider using __attribute__((unused)) later.
- Don’t forget that these additional options help to make the code even more secure: -Wformat-security -Wduplicated-cond -Wfloat-equal -Wshadow -Wconversion -Wjump-misses-init -Wlogical-not-parentheses -Wnull-dereference
- Compile with optimization (-O1 or higher) to enable the compiler to issue better diagnostics and help to find real bugs in the code.
- Use the latest possible gcc; each new major version adds dozens of new checks and improves existing ones.

What’s Next

Static program analysis is always a trade-off between the number of real bugs it finds and the number of false positives it reports. In other words, not all true bugs are found and reported at compile time, which is why keeping your eyes open at run time is also important. The GNU compiler can help with that, too: gcc is capable of adding checks to the code that it generates (“sanitizing” it), thus enabling automatic bug detection at run time. This compiler feature can help to find bugs that completely escape static analysis. I also plan to look at the built-in debugging capabilities of the support libraries the GNU toolchain provides.

References

SEI CERT Coding Standards for C, C++, Java, and Perl.
A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World – an article from the creators of Coverity.
A complete list of gcc 7.3 options with description.
Difference in gcc options between versions that shows the amount of new kinds of analysis each new gcc version adds.
Parfait – Oracle Labs static program analysis tool.