Improving GCC Buffer Overflow Detection for C Flexible Array Members

This blog entry was contributed by Qing Zhao. Qing works on GCC in the Oracle Linux Toolchain Team, has been participating in the KSPP project for several years, and has contributed several security features to GCC.

Introduction

In this blog we will take a close look at the recent GCC additions, done by Qing Zhao, of a new attribute and a new builtin function, which further improve the detection of buffer overflows. Readers will learn about the motivation behind this work and how to use the new features via a number of examples. Various related caveats and rules are also discussed. Finally we will look at the state of the adoption of these new features in the Linux kernel and beyond.

As we described in our previous blog entry on the topic of code security, “Closing A Hole In The Detection Of Buffer Overflows With GCC”, the new GCC options -fstrict-flex-arrays and -Wstrict-flex-arrays make it possible to identify the classes of flexible arrays that do not conform to C99 (the so-called “fake flexible arrays”) and convert them to fixed-size arrays, for which bound checking is feasible: these arrays are protected completely by the current buffer overflow detection tools in GCC.

The question remains: what about the dynamically-sized arrays whose size information is not in the source code, such as the C99 flexible array members (FAMs) and pointer offsets? There is no way to compute their size and, therefore, determine if there is an out of bound access. What can we do to tell the compiler about their size to enable full protection by the buffer overflow detection tools?

This blog will focus on providing an answer to the above question about flexible array members. We describe here two new GNU extensions which specify size information for FAMs. These are a new attribute, “counted_by”, and a new builtin function, “__builtin_counted_by_ref”. Both extensions can be used in GNU C applications to specify size information for FAMs, improving the buffer overflow detection for FAMs in general.

This work has been done in coordination with the Kernel Self Protection Project (KSPP) and has been added to GCC15, which was released in April 2025. Any application can adopt this approach to make the code more robust and secure. In the real world, the Linux kernel has extensively adopted these new extensions to annotate the FAMs in the source code with corresponding size information. Such new size information greatly helps the buffer overflow detection mechanisms provided by the GCC compiler to catch more dangerous buffer overflow flaws in the Linux kernel.

Flexible Array Members (FAMs)

Flexible array members (FAMs) are one class of dynamically-sized arrays whose size is not known at compile time.

FAMs were formalized and added to the ISO C99 standard in May 2000. A FAM is declared as an array with no specified size (written a[]) and must satisfy the following two conditions:

The array should be inside a structure and be declared as the last member of the structure.
The structure must contain at least one more named member in addition to the flexible array member.

The following is a complete small example of usage of a flexible array member:


$ cat simple-fam.c
struct A {
  unsigned count;
  int buf[]; 
};

static struct A __attribute__ ((noinline)) *alloc_buf (unsigned sz)
{
  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(int));
  obj->count = sz;
  return obj;
}

int main ()
{
  struct A *p = alloc_buf (10);
  p->buf[12] = 22;	// out-of-bound access
  return 0;
}

Above we can see that the structure A includes a trailing FAM field buf, whose number of elements is not specified explicitly in the source code. Each object of this structure is allocated through the routine alloc_buf(), in which the size of the allocation is computed based on the known size of the fixed portion of the structure A, plus the required size for the FAM member buf whose number of elements is passed in the parameter sz. Inside the routine alloc_buf(), the number of elements of the array buf, sz is recorded into the other field of the structure A, obj->count.

Inside the function main(), p is a pointer to an object of type structure A, with 10 elements in its FAM field buf. Space for the struct is allocated first, followed by an out-of-bound access to the array buf.
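
The size computation described above can overflow when the element count is very large. As a minimal sketch, assuming hypothetical helper names fam_alloc_size() and alloc_checked() that are not part of the original example, the same allocation can be written with an explicit overflow check:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct A {
  unsigned count;
  int buf[];                /* C99 flexible array member */
};

/* Hypothetical helper: bytes needed for a struct A holding n elements,
   or 0 if the computation would overflow size_t.  */
size_t fam_alloc_size (size_t n)
{
  if (n > (SIZE_MAX - sizeof (struct A)) / sizeof (int))
    return 0;
  return sizeof (struct A) + n * sizeof (int);
}

/* Hypothetical variant of alloc_buf(): allocate the object and record
   the element count, refusing sizes that cannot be represented.  */
struct A *alloc_checked (unsigned n)
{
  size_t bytes = fam_alloc_size (n);
  if (bytes == 0)
    return NULL;
  struct A *obj = malloc (bytes);
  if (obj != NULL)
    obj->count = n;
  return obj;
}
```

A caller would use alloc_checked (10) exactly like alloc_buf (10) and release the object with free().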

With GCC 14 (or earlier) and the bounds sanitizer (-fsanitize=bounds), the out-of-bound access to this flexible array cannot be detected because the bound of the array is not known: the program runs without reporting any errors.


  $ gcc --version
    gcc (GCC) 14.3.0
    [...]
  $ gcc -O -fsanitize=bounds simple-fam.c
  $ ./a.out
  $

Similarly, using the builtin function __builtin_dynamic_object_size() on FAMs results in undetected buffer overflows. Recall the example t4.c from the previous blog, reproduced here:


$ cat t4.c
#include <string.h>
struct trailing_array {
 int b;
 char test[8];
};

#undef memset
#define memset(dest, src, n) \
 __builtin___memset_chk (dest, src, n, __builtin_dynamic_object_size (dest, 1))

void __attribute__ ((noinline)) my_memset (struct trailing_array *p, int n)
{
 memset(p->test, 'b', n);
 return;
}

int main ()
{
 struct trailing_array t_a;
 my_memset(&t_a, 10);
 __builtin_printf("Pass \n");
 return 0;
}

The __builtin_dynamic_object_size function returns (simply put) the size of its first argument, and it is used to write a hardened version of memset(), which checks that the size of what is being set (the “destination”) is large enough to contain the number of bytes to be set. When compiling the example t4.c with -fstrict-flex-arrays=3, the array p->test is treated as a fixed array, and its size is known, therefore the buffer overflow is detected successfully (the array p->test is 8 chars, but we access and set 10 chars with the my_memset() function call):

  $ gcc --version
    gcc (GCC) 14.3.0
    [...]
  $ gcc -O -fstrict-flex-arrays=3 t4.c; ./a.out
  *** buffer overflow detected ***: ./a.out terminated

However, if we change the array test[8] to a FAM, as test[] in the following, the behavior of the program is different.


$ cat t4-1.c
#include <string.h>
struct trailing_array {
 int b;
 char test[];
};

#define MAX(a, b) ((a) > (b) ? (a) : (b))

#undef memset
#define memset(dest, src, n) \
 __builtin___memset_chk (dest, src, n, __builtin_dynamic_object_size (dest, 1))

void __attribute__ ((noinline)) my_memset (struct trailing_array *p, int n)
{
 memset(p->test, 'b', n);
 return;
}

static struct trailing_array * __attribute__ ((noinline))
setup (int length)
{
  size_t size = MAX (sizeof (struct trailing_array),
    (__builtin_offsetof (struct trailing_array, test[0])
      + (length) * sizeof (char)));
  struct trailing_array *p = (struct trailing_array *) __builtin_malloc (size);
  p->b = length;
  return p;
}

int main ()
{
 struct trailing_array *t_a = setup (8);
 my_memset(t_a, 10);
 __builtin_printf("Pass \n");
 return 0;
}

Even though the FAM array p->test has 8 elements and the call to my_memset() triggers a buffer overflow, the overflow is not detected because the size of the FAM is not known to the compiler.

  $ gcc --version
    gcc (GCC) 14.3.0
    [...]
  $ gcc -O t4-1.c; ./a.out
  Pass
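
The difference between the two runs comes down to the size that reaches the checked memset. The following sketch models that check in plain C. The names model_memset_chk and SIZE_UNKNOWN are hypothetical, and the model returns NULL where the real __builtin___memset_chk aborts the program. It relies on the documented behavior that the object size builtins return (size_t) -1 for types 0 and 1 when the size cannot be determined.

```c
#include <stddef.h>
#include <string.h>

/* Value the object size builtins return (types 0 and 1) when the
   size of the destination cannot be determined.  */
#define SIZE_UNKNOWN ((size_t) -1)

/* Hypothetical model of the check: refuse the write when n exceeds
   the known destination size.  The real checked memset aborts the
   program instead of returning NULL.  */
void *model_memset_chk (void *dest, int c, size_t n, size_t dest_size)
{
  if (n > dest_size)
    return NULL;            /* overflow detected */
  return memset (dest, c, n);
}
```

With a known size of 8, a request to set 10 bytes is rejected. With SIZE_UNKNOWN, the comparison can never fail, so the same request goes through, which is what happens for the FAM in t4-1.c.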

Let’s see how the new counted_by attribute and the new builtin in GCC improve both situations above, making it possible to detect these missed buffer overflows.

Inform the compiler about the size of a FAM

For a well-described structure with a FAM field, there is usually another field in the same structure that holds the element count for the FAM field (such as the field count in the simple-fam.c example). Associating this field holding the element count with the FAM field is a natural way to tell the compiler about the size of the FAM field, but until recently there was no way to do so.

In GCC15, we introduced a new GNU C variable attribute, counted_by, to enable programmers to annotate the FAM field with its associated size information. counted_by (count) is defined in the GCC manual as:

“The counted_by attribute may be attached to the C99 flexible array member of a structure. It indicates that the number of the elements of the array is given by the field count in the same structure as the flexible array member.”

For instance, with this new attribute, the previous structure A in simple-fam.c becomes:

  
struct A {
  unsigned count;
  int buf[] __attribute__ ((counted_by (count)));
};

Adding this attribute specifies that the number of elements of the FAM buf is given by the field count in the same structure. (In the remainder of the blog, we call the field count the “counted-by field” for simplicity).
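
Code that must still build with compilers lacking the attribute can hide it behind a macro. The sketch below assumes a hypothetical project-local macro name, COUNTED_BY, and uses the __has_attribute preprocessor operator available in GCC and Clang. When the attribute is not supported, the macro expands to nothing and the structure is an ordinary FAM structure.

```c
#include <stdlib.h>

/* __has_attribute is provided by GCC and Clang; assume "not
   supported" on compilers that lack it.  */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* COUNTED_BY is a hypothetical macro name chosen for this sketch.  */
#if __has_attribute(counted_by)
# define COUNTED_BY(member) __attribute__ ((counted_by (member)))
#else
# define COUNTED_BY(member)
#endif

struct A {
  unsigned count;
  int buf[] COUNTED_BY (count);
};

struct A *alloc_buf (unsigned sz)
{
  struct A *obj = malloc (sizeof (struct A) + sz * sizeof (int));
  if (obj != NULL)
    obj->count = sz;        /* set the count before any use of buf */
  return obj;
}
```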

When the number of elements of the FAM field is made known to the compiler by the counted_by attribute, the dynamic buffer overflow detection tools can use it: the bounds sanitizer can detect an out-of-bounds access to the FAM field, and the __builtin_dynamic_object_size builtin can compute the object size of the FAM field.

We slightly change the previous example simple-fam.c by just adding a counted_by attribute to the FAM field buf:


$ cat countedby-fam.c
struct A {
  unsigned count;
  int buf[] __attribute__ ((counted_by (count)));
};

static struct A __attribute__ ((noinline)) *alloc_buf (unsigned sz)
{
  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(int));
  obj->count = sz;
  return obj;
}

int main ()
{
  struct A *p = alloc_buf (10);
  __builtin_printf("Size of buf is: %zu\n", __builtin_dynamic_object_size(p->buf, 1));
  p->buf[12] = 22;      // out-of-bound access
  return 0;
}

Thanks to the new attribute counted_by, the out-of-bound access in the source code is now caught successfully by the bounds sanitizer, and the size of the array buf is known:


$ gcc -O -fsanitize=bounds countedby-fam.c; ./a.out
Size of buf is: 10
countedby-fam.c:15:9: runtime error: index 12 out of bounds for type 'int [*]'
$
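
Conceptually, the check that the attribute makes possible is a comparison of the index against the counted-by field. The following hand-written equivalent is only an illustration of that comparison, not the instrumentation GCC actually emits. The function name a_store is hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

struct A {
  unsigned count;
  int buf[];
};

struct A *alloc_buf (unsigned sz)
{
  struct A *obj = malloc (sizeof (struct A) + sz * sizeof (int));
  if (obj != NULL)
    obj->count = sz;
  return obj;
}

/* Hypothetical checked store: write v at index i only if i is below
   the recorded element count.  Returns 0 on success, -1 otherwise.  */
int a_store (struct A *p, size_t i, int v)
{
  if (i >= p->count)
    return -1;              /* would be an out-of-bound access */
  p->buf[i] = v;
  return 0;
}
```

With an object of 10 elements, a_store (p, 9, 1) succeeds and a_store (p, 12, 22) is rejected, mirroring the sanitizer report above.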

If we modify the source code of our previous example t4-1.c by adding the counted_by attribute (t4-2.c), we can see how __builtin_dynamic_object_size now helps detect the overflow.


$ cat t4-2.c
#include <string.h>
struct trailing_array {
 int b;
 char test[] __attribute__ ((counted_by (b)));
};

#define MAX(a, b) ((a) > (b) ? (a) : (b))

#undef memset
#define memset(dest, src, n) \
 __builtin___memset_chk (dest, src, n, __builtin_dynamic_object_size (dest, 1))

void __attribute__ ((noinline)) my_memset (struct trailing_array *p, int n)
{
 memset(p->test, 'b', n);
 return;
}

static struct trailing_array * __attribute__ ((noinline))
setup (int length)
{
  size_t size = MAX (sizeof (struct trailing_array),
    (__builtin_offsetof (struct trailing_array, test[0])
      + (length) * sizeof (char)));
  struct trailing_array *p = (struct trailing_array *) __builtin_malloc (size);
  p->b = length;
  return p;
}

int main ()
{
 struct trailing_array *t_a = setup (8);
 my_memset(t_a, 10);
 __builtin_printf("Pass \n");
 return 0;
}

We get, with GCC15:


 $ gcc -O t4-2.c; ./a.out
 *** buffer overflow detected ***: ./a.out terminated
 Aborted (core dumped)

You can see that with ONLY a counted_by attribute added, the buffer overflow can be detected during runtime.

Considerations on the use of the counted_by attribute to avoid pitfalls

There are some complexities that programmers need to keep in mind when using this attribute.

One important requirement is that the counted-by field must have an integer type. For instance, all the following usages are legal:


struct fam_1 {
  unsigned char count1;
  int array_1[] __attribute__ ((counted_by (count1)));
};

struct fam_2 {
  int count2;
  int array_2[] __attribute__ ((counted_by (count2)));
};

struct fam_3 {
  _Bool count3;
  int array_3[] __attribute__ ((counted_by (count3)));
};

enum week {Mon, Tue, Wed};
struct fam_4 {
  enum week count4;
  int array_4[] __attribute__ ((counted_by (count4)));
};

On the other hand, the following is not valid, and the compiler will issue an error:


struct fam_5 {
  float count;
  int array_5[] __attribute__ ((counted_by (count))); 
 /*Compiler error: attribute is not a field declaration with an integer type*/
};

It would be natural to expect the counted-by fields to be unsigned, since negative counts do not make a lot of sense. However, in existing applications such as the Linux kernel, many counters have a signed integer type, and changing them to an unsigned integer type is an additional refactoring burden. Furthermore, some of the signed integer counters are used with negative values in other parts of the code for various reasons, which adds more overhead to correctly refactoring the existing code in order to adopt the counted_by attribute. (Please see the original request of this feature)

In order to ease the adoption of the counted_by attribute into existing applications, GCC allows the counted-by field to be of signed integer type and permits it to be assigned a negative value in this case.

When the counted_by field has a signed integer type and is assigned a negative value in the program, the compiler treats the negative value as zero.

Even though the above situation is accepted, we do not encourage such programming style in general since it is very confusing and error-prone.

The following short example countedby-int.c demonstrates this behavior, which may be surprising at first sight. It includes two small changes to countedby-fam.c. First, the type of the counted-by field is changed from unsigned to int; second, the routine alloc_buf() does not allocate space for the FAM field buf. As a result, the FAM field buf has zero elements. In the main routine, a negative value, -10, is passed to the routine alloc_buf() to set the counted-by field to this value.

As noted, the result of the call to __builtin_dynamic_object_size(p->buf, 1) is 0, which correctly reflects the size of the FAM field p->buf.


$ cat countedby-int.c
struct A {
  int count;
  int buf[] __attribute__ ((counted_by (count)));
};

static struct A __attribute__ ((noinline)) *alloc_buf (unsigned sz)
{
  struct A *obj = __builtin_malloc (sizeof(struct A));
  obj->count = sz;
  return obj;
}

int main ()
{
  struct A *p = alloc_buf (-10);
  __builtin_printf("Size of buf is: %zu\n", __builtin_dynamic_object_size(p->buf, 1));
  p->buf[0] = 22; 	// out-of-bound access
  return 0;
}


$ gcc -O countedby-int.c -fsanitize=bounds; ./a.out
Size of buf is: 0
countedby-int.c:14:9: runtime error: index 0 out of bounds for type 'int [*]'
$
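
The rule that a negative counted-by value is treated as zero can be modelled in a few lines of plain C. This is only a sketch of the rule as described above, with hypothetical function names, and not the code the compiler generates.

```c
#include <stddef.h>

/* Hypothetical model: the number of elements assumed for a signed
   counted-by value.  Negative values are treated as zero.  */
size_t effective_count (int count)
{
  return count < 0 ? 0 : (size_t) count;
}

/* Model of a bounds check against that effective count.
   Returns 1 if idx is in bounds and 0 otherwise.  */
int index_in_bounds (int count, size_t idx)
{
  return idx < effective_count (count);
}
```

Under this model a count of -10 gives zero elements, so even index 0 is out of bounds, which matches the sanitizer report for p->buf[0] above.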

A second area in which the programmer must be very careful is the following.

For any object having a structure type that includes a FAM field qualified by a counted_by attribute such as *p in the above countedby-int.c, there is a pair of sub-objects i.e. a FAM object p->buf, and a size object p->count. This pair of sub-objects must meet two requirements:

The size object p->count must be initialized before the first reference to the FAM object p->buf;
The FAM object p->buf must have at least the number of elements specified by the size object p->count at all times.

Failing to meet either of these two requirements might result in undefined behavior of the buffer overflow detection tools.

For example, let’s modify the countedby-int.c example by making the count field an unsigned int, deleting the counted_by field initialization and changing the out-of-bound access to an in-bounds access as follows:


$ cat countedby-noinit.c
struct A { 
  unsigned count;
  int buf[] __attribute__ ((counted_by (count))); 
};

static struct A __attribute__ ((noinline)) *alloc_buf (unsigned sz)
{
  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(int));
  return obj;
}

int main ()
{
  struct A *p = alloc_buf (10);
  p->buf[4] = 22;      // now this is an in-bound access
  return 0;
}


$ gcc -O -fsanitize=bounds countedby-noinit.c; ./a.out
countedby-noinit.c:14:9: runtime error: index 4 out of bounds for type 'int [*]'

Here we can see that even though the array access “p->buf[4]” is an in-bound access, the counted-by field is not initialized correctly before the array access, which results in the bounds sanitizer incorrectly reporting an error.

On the other hand, if the counted-by field is set to a value that is larger than the real number of elements in the FAM field, the bounds sanitizer or the __builtin_dynamic_object_size will behave incorrectly too.

Consider the following example:


$ cat countedby-larger.c
struct A { 
  unsigned count;
  int buf[] __attribute__ ((counted_by (count))); 
};

static struct A __attribute__ ((noinline)) *alloc_buf (unsigned sz)
{
  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(int));
  obj->count = sz + 10;
  return obj;
}

int main ()
{
  struct A *p = alloc_buf (10);
  p->buf[12] = 22;      // out-of-bound access
  return 0;
}


$ gcc -O -fsanitize=bounds countedby-larger.c
$ ./a.out
$

When the value of p->count is larger than the real number of elements of the corresponding FAM field p->buf, the bounds sanitizer will not catch the out-of-bound access during run-time.

Another potential mistake the programmer might make is to accidentally reset the value of the counted-by field in other parts of the code even though it was initialized correctly. For example:


$ cat countedby-reset.c
struct A { 
  unsigned count;
  int buf[] __attribute__ ((counted_by (count))); 
};

static struct A __attribute__ ((noinline)) *alloc_buf (unsigned sz)
{
  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(int));
  obj->count = sz;
  return obj;
}

int main ()
{
  struct A *p = alloc_buf (10);
  p->count += 10;
  p->buf[12] = 22;      // out-of-bound access
  return 0;
}


$ gcc -O -fsanitize=bounds countedby-reset.c
$ ./a.out
$

Then the bounds sanitizer cannot detect this out-of-bound error anymore.

In summary, it is very important to meet all the requirements for the counted_by attribute when adopting it in your applications.
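
One way to keep both requirements satisfied is to confine every change of the count to a few helpers that update the count and the storage in a safe order. The sketch below assumes hypothetical names (COUNTED_BY, a_new, a_resize). When shrinking it lowers the count before the storage shrinks, and when growing it raises the count only after the larger storage exists.

```c
#include <stddef.h>
#include <stdlib.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(counted_by)
# define COUNTED_BY(member) __attribute__ ((counted_by (member)))
#else
# define COUNTED_BY(member)   /* hypothetical macro; no-op fallback */
#endif

struct A {
  unsigned count;
  int buf[] COUNTED_BY (count);
};

/* Allocate n elements and initialize the count before buf is used.  */
struct A *a_new (unsigned n)
{
  struct A *p = malloc (sizeof (struct A) + (size_t) n * sizeof (int));
  if (p != NULL)
    p->count = n;
  return p;
}

/* Resize to n elements.  Returns the (possibly moved) object, or NULL
   on failure, in which case p is still valid.  New elements are left
   uninitialized.  */
struct A *a_resize (struct A *p, unsigned n)
{
  if (n < p->count)
    p->count = n;           /* shrink: lower the count first */
  struct A *q = realloc (p, sizeof (struct A) + (size_t) n * sizeof (int));
  if (q == NULL)
    return NULL;            /* count never exceeds the old storage */
  q->count = n;             /* grow: raise the count last */
  return q;
}
```

At no point in either helper does the count exceed the number of elements actually allocated.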

In a future GCC release we might provide additional diagnostics by issuing compiler warnings when the above requirements are not met, to help the users to avoid such mistakes.

A new builtin is needed: __builtin_counted_by_ref()

Now that the new attribute counted_by has been added to GCC, it has been extensively adopted by the Linux kernel. There are currently about 450 FAMs with a counted-by field in the Linux kernel source tree.

However, during the adoption of counted_by in the Linux kernel, it was realized that the allocations of flexible arrays do not use any common syntactic construct (function or macro) to avoid repetitions and mistakes. Something similar to the following macro would be needed to avoid repeating the same steps at each initialization (note that this is a simplified macro for illustrative purposes only):


#define alloc_flex(P, FAM, COUNT) ({ \
 size_t __size = compute_size(P,FAM,COUNT); \
 __auto_type __p = &(P); \
 *__p = alloc(__size); \
})

In it, the routine compute_size is used to calculate the size of the object that is pointed to by the pointer P and whose type is a structure type including a FAM field with the number of elements in this FAM field being COUNT.

In the case of a structure A with counted-by field count:


struct A {
  size_t count;
  int buf[] __attribute__((counted_by (count)));
} *p1, *p2, ... *pn;

With the simplification of the code by using the alloc_flex macro, each object of structure A would be allocated in different places like so:


p1 = alloc_flex(p1, buf, count1);

p2 = alloc_flex(p2, buf, count2);

pn = alloc_flex(pn, buf, countn);

However, even with the use of the macro, the counted-by field still has to be set correctly whenever memory is allocated for the whole structure, so an initialization of it should be added after each allocation, such as:


p1 = alloc_flex(p1, buf, count1);
p1->count = count1;

p2 = alloc_flex(p2, buf, count2);
p2->count = count2;

pn = alloc_flex(pn, buf, countn);
pn->count = countn;

This would be a time-consuming and error-prone task for large applications, as it would be easy to forget to add such counted-by field initialization. It would be much simpler and safer if the initialization could be done inside the alloc_flex macro itself. This would require accessing the counted-by field of the structure from inside the macro, by some different mechanism.

A new builtin function, __builtin_counted_by_ref, was deemed useful and generic enough to be added into GNU C to resolve this issue. It is defined as:

“Built-in Function: type __builtin_counted_by_ref (ptr) The built-in function __builtin_counted_by_ref checks whether the array object pointed by the pointer ptr has another object associated with it that represents the number of elements in the array object through the counted_by attribute (i.e. the counted-by object). If so, returns a pointer to the corresponding counted-by object. If such counted-by object does not exist, returns a null pointer.”

The behavior of this new builtin is shown below in the two possible cases:


struct A { 
  size_t count; 
  int buf[] __attribute__((counted_by(count))); 
} *obj1;

In the case above, there is a counted-by field in the structure A, and a call __builtin_counted_by_ref (obj1->buf) returns the address of such field (&obj1->count).

And:


struct B {
  size_t count;
  char buf[];
} *obj2;

In the case above, structure B doesn’t have a counted-by field, and a call __builtin_counted_by_ref (obj2->buf) returns the NULL pointer ((void*)NULL).

With the availability of this new builtin function, the above object allocator macro alloc_flex could be updated to:


#define alloc_flex(P, FAM, COUNT) ({ \
  size_t __size = compute_size (P, FAM, COUNT); \
  __auto_type __p = &(P); \
  *__p = alloc(__size); \
  __auto_type ret = __builtin_counted_by_ref((*__p)->FAM); \
  *_Generic(ret, void *: &(size_t){0}, default: ret) = (COUNT); \
 })

The additional counter initialization for each allocation is no longer needed, because the counted-by field is accessible from within the macro itself via the new builtin. The macro allocates the object as before, but it now also calls the new builtin to retrieve the address of the counter and assigns the value COUNT through it. It uses the _Generic keyword to distinguish the two situations illustrated above based on the type of the result ret. If ret has type void *, there is no counted-by field in the structure, and the macro assigns COUNT to a “throw away” location (the &(size_t){0}). If ret has any other pointer type, there is a counted-by field in the structure, and the macro assigns COUNT to that field (through ret). This allows the macro to be used unchanged even for structures which don’t use the counted_by attribute.

Thanks to this new builtin, the extra assignment after each macro use is not needed: everything is included in the macro, facilitating adoption of this new security feature.
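
For readers who want to experiment, here is a self-contained sketch along the same lines. It makes several assumptions that are not in the macro above: malloc stands in for the generic alloc(), the size is computed inline instead of by compute_size(), and all macro and function names (COUNTED_BY, FLEX_COUNTER, HAVE_COUNTED_BY_REF, ALLOC_FLEX, new_annotated, new_plain) are hypothetical. The builtin is called directly inside _Generic rather than through an intermediate variable, and when the compiler lacks the builtin the sketch behaves as if no counted-by field existed. Overflow checking of the size computation is omitted for brevity.

```c
#include <stddef.h>
#include <stdlib.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

#if __has_attribute(counted_by)
# define COUNTED_BY(member) __attribute__ ((counted_by (member)))
#else
# define COUNTED_BY(member)
#endif

/* Without the builtin, behave as if no counted-by field exists.  */
#if __has_builtin(__builtin_counted_by_ref)
# define HAVE_COUNTED_BY_REF 1
# define FLEX_COUNTER(FAM) __builtin_counted_by_ref (FAM)
#else
# define HAVE_COUNTED_BY_REF 0
# define FLEX_COUNTER(FAM) ((void *) 0)
#endif

/* Allocate P with COUNT elements in FAM and, when FAM has a
   counted-by field, store COUNT in that field.  */
#define ALLOC_FLEX(P, FAM, COUNT) ({                              \
  size_t count_ = (COUNT);                                        \
  __typeof__ (P) p_ = malloc (sizeof (*p_)                        \
                              + sizeof (*p_->FAM) * count_);      \
  if (p_ != NULL)                                                 \
    *_Generic (FLEX_COUNTER (p_->FAM),                            \
               void *: &(size_t){0},                              \
               default: FLEX_COUNTER (p_->FAM)) = count_;         \
  (P) = p_;                                                       \
})

struct annotated {
  size_t count;
  int buf[] COUNTED_BY (count);
};

struct plain {
  size_t count;
  int buf[];                /* no counted-by field */
};

struct annotated *new_annotated (size_t n)
{
  struct annotated *p;
  ALLOC_FLEX (p, buf, n);   /* also sets p->count when supported */
  return p;
}

struct plain *new_plain (size_t n)
{
  struct plain *p;
  ALLOC_FLEX (p, buf, n);   /* count is left untouched */
  return p;
}
```

The same macro serves both structures: for struct annotated the count is stored by the macro when the builtin is available, while for struct plain the assignment goes to the temporary and has no effect.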

Status of the new attribute and builtin

The work to support the new counted_by attribute and the new builtin __builtin_counted_by_ref has been committed to GCC15 and was released in April 2025.

The Linux kernel has been updated with this new protection for flexible array members and has already started to catch buffer overflow flaws in FAMs that could not be detected previously.

There is work being proposed to add allocation macros to the Linux kernel like the ones presented here. The amount of work for cleanups necessary before this macro can be introduced is quite significant and it will take some more time. In the end though it will render the Linux kernel code cleaner and safer. See this kernel mailing list thread for the macro proposal, and the use of the builtin function in this other thread. In addition to the Linux kernel, the counted_by attribute has been adopted by some userspace projects, like PHP, Bind, Varnish and others.

We encourage programmers to adapt their applications to leverage this new attribute to further protect applications from buffer overflows. Don’t wait to do so!

What’s next?

Now that flexible array members are properly protected, the next natural step is to extend the counted_by attribute to provide size information for the last class of dynamically-sized arrays whose size cannot be determined, i.e., pointers that point into arrays. This will be the topic of our next blog in this series.

After that is completed, all the dynamically-sized arrays can be nicely protected.

Acknowledgments

Many people have provided helpful insight and feedback for this work. Without them it could not have been done: Kees Cook, Siddhesh Poyarekar, Bill Wendling, Martin Uecker, Joseph Myers and Richard Biener.

Authors

Elena Zannoni

Qing Zhao