Alan Hargreaves's Weblog

Why I hate macros that make pointer dereferences look like structure elements

Alan Hargreaves
Senior Principal Technical Support Engineer

I have a colleague who generated an IDR patch for tcp in Solaris 10 for me to give relief to a customer for a bug while the formal fix was in progress.

As a part of the fix we had this code fragment

 18984          /\*
18985 \* If the SACK option is set, delete the entire list of
18986 \* notsack'ed blocks.
18987 \*/
18988 if (tcp->tcp_sack_info != NULL) {
18989 if (tcp->tcp_notsack_list != NULL)
18990 TCP_NOTSACK_REMOVE_ALL(tcp->tcp_notsack_list, tcp);
18991 }

replaced with this code fragment (the fix actually has a lot more to it than this, but this is what was relevent.)

 18936          /\*
18937 \* If the SACK option is set, delete the entire list of
18938 \* notsack'ed blocks.
18939 \*/
18941 if (tcp->tcp_notsack_list != NULL)
18942 TCP_NOTSACK_REMOVE_ALL(tcp->tcp_notsack_list, tcp);

Now, the assembly around here reads

ip:tcp_process_shrunk_swnd+0x38:       ldx      [%i0 + 0xf8], %g1
ip:tcp_process_shrunk_swnd+0x3c: add %g3, %i1, %g2
ip:tcp_process_shrunk_swnd+0x40: stw %g2, [%i0 + 0x80]
ip:tcp_process_shrunk_swnd+0x44: ldx [%g1 + 0x48], %i5

and the register dump

pc:  0x7bed2918 ip:tcp_process_shrunk_swnd+0x44:   ldx  [%g1 + 0x48], %i5
npc: 0x7bed291c ip:tcp_process_shrunk_swnd+0x48: subcc %i5, 0x0, %g0 ( cmp %i5, 0x0 )
global: %g1 0
%g2 0x761c %g3 0x68ec
%g4 0x600216f6e6c %g5 0
%g6 0x1c %g7 0x2a101f89ca0
out: %o0 0x600210d8640 %o1 0x1e0
%o2 0x5a8 %o3 0x3c8
%o4 0x600216f6e6c %o5 0x600216f68c4
%sp 0x2a101f88c61 %o7 0x7bed2900
loc: %l0 0xc7341c85 %l1 0x2000
%l2 0x60010972000 %l3 0x600210d8640
%l4 0x1000 %l5 0x1000
%l6 0x1000 %l7 0x5
in: %i0 0x600210d8640 %i1 0xd30
%i2 0 %i3 0xc73439b5
%i4 0 %i5 0x4
%fp 0x2a101f88d11 %i7 0x7becbed4

The last instruction is where we paniced (yes the customer paniced [twice] as a result of this). As we can see from the
register dump, %g1 is NULL, so we definitely have a NULL pointer
dereference going on.

So where did this come from? It looks like a dereference of 0xf8
from %i0. %i0 is a (tcp_t \*) making %g1 a (tcp_sack_info \*), namely
arg0->tcp_sack_info if we look at the structure; but hang on, the code
says tcp->tcp_notsack_list, not tcp->tcp_sack_info. Indeed that element
name does not exist within a tcp_t.

A light dawns when we see that:

   299  #define tcp_notsack_list


So in reality line 18941 is doing:

 18941          if (tcp->tcp_sack_info->tcp_notsack_list != NULL)

Without checking whether or not tcp->tcp_sack_info is non-NULL. The correct line should perhaps read

 18941          if (tcp->tcp_sack_info != NULL && tcp->tcp_notsack_list != NULL)

Now this would probably not have made it as far as in IDR patch delivered to a customer, if we didn't have that macro definition because alarm bells would have rung that we were doing another dereference!

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.