Thursday Mar 14, 2013

The pains of preprocessing

OK, I've run into this twice in 24 hours, so it's probably worth talking about.

The preprocessor does a simple text substitution as it works its way through your source files. Sometimes this has "unanticipated" side-effects. When this happens, you'll normally get a "hey, this makes no sense at all" error from the compiler. Here's an example:

$ more c.c
#include <ucontext.h>
#include <stdio.h>

int main()
{
  int FS;
  FS=0;
  printf("FS=%i",FS);
}

$ CC c.c
"c.c", line 6: Error: Badly formed expression.
"c.c", line 7: Error: The left operand must be an lvalue.
2 Error(s) detected.

A similar thing happens with g++:

$  /pkg/gnu/bin/g++ c.c
c.c: In function 'int main()':
c.c:6:7: error: expected unqualified-id before numeric constant
c.c:7:6: error: lvalue required as left operand of assignment

The Studio C compiler gives a bit more of a clue about what is going on, because its message shows the substituted value 1. But it's not something you can rely on:

$ cc c.c
"c.c", line 6: syntax error before or at: 1
"c.c", line 7: left operand must be modifiable lvalue: op "="

As you can guess, the issue is that FS is a macro, so it gets substituted. We can find out what happens by examining the preprocessed source:

$ CC -P c.c
$ tail c.i
int main ( )
{
int 1 ;
1 = 0 ;
printf ( "FS=%i" , 1 ) ;
}

You can confirm this using -xdumpmacros to dump out the macros as they are defined. You can combine this with -H to see which header files are included:

$ CC -xdumpmacros c.c 2>&1 |grep FS
#define _SYS_ISA_DEFS_H
#define _FILE_OFFSET_BITS 32
#define REG_FSBASE 26
#define REG_FS 22
#define FS 1
....
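If you'd rather check from inside the source file, a stringification probe can show what a name turns into. This is only a sketch: the `#define FS 1` line stands in for the definition the header supplies on my system, and `STRINGIFY`, `EXPANSION_OF`, `fs_expansion` and `report_fs` are names I made up for the illustration.

```c
#include <stdio.h>

/* Assumption: stand-in for the definition that <ucontext.h> supplies on the
   system in the post, so the sketch behaves the same everywhere. */
#define FS 1

/* Two levels are needed: the argument is macro-expanded on its way through
   EXPANSION_OF, and only then turned into a string by STRINGIFY. */
#define STRINGIFY(x) #x
#define EXPANSION_OF(x) STRINGIFY(x)

/* Hypothetical helper: returns the text the compiler sees in place of FS. */
const char *fs_expansion(void)
{
    return EXPANSION_OF(FS);
}

/* Hypothetical helper: prints the result of the probe. */
void report_fs(void)
{
    printf("FS expands to: %s\n", fs_expansion());
}
```

If FS were not a macro, the probe would produce the string "FS" itself, so a result that differs from the name is the giveaway.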

If you're using gcc, use the -E option to get the preprocessed source, and add the -dD option to keep the macro definitions in that output. The line markers in the output show which include file each definition came from.
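Once you know which name is being clobbered, the simplest fix is to rename your variable. If that isn't an option, one possible workaround is to release the name with #undef after the includes. The sketch below assumes nothing later in the file needs the header's meaning of FS. The `#define FS 1` line is again a stand-in for the header, and `FS_WAS_MACRO` and `use_fs_as_variable` are names invented for the example.

```c
#include <stdio.h>

/* Assumption: stand-in for the header's definition of FS. */
#define FS 1

/* Record whether the name was taken, then release it for our own use. */
#ifdef FS
#define FS_WAS_MACRO 1
#undef FS
#else
#define FS_WAS_MACRO 0
#endif

/* After the #undef, FS is an ordinary identifier again. */
int use_fs_as_variable(void)
{
    int FS;
    FS = 0;
    printf("FS=%i\n", FS);
    return FS;
}
```

The catch is that any other macro from the header that relies on FS will stop working after the #undef, which is why renaming is usually the safer choice.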

Thursday Apr 28, 2011

Catching the macro bug

I have to admit a dislike for macros. I've seen plenty of code where it has been a Herculean task to figure out exactly which source code generated a particular piece of assembly code. So perhaps I'm biased to begin with. However, I recently hit another annoyance with macros. The following code looks pretty benign:

#include <stdio.h>
#include <sys/time.h>

int timercmp(struct timeval *end, struct timeval *begin, struct timeval *result)
{
  printf("TIMERCMP");
}

However, at compile time it produces the following errors:

$ cc error.c
"error.c", line 4: syntax error before or at: struct
"error.c", line 4: syntax error before or at: )
"error.c", line 4: warning: old-style declaration or incorrect type for: tv_sec
"error.c", line 4: syntax error before or at: )
"error.c", line 4: warning: old-style declaration or incorrect type for: tv_sec
"error.c", line 4: syntax error before or at: )
"error.c", line 4: warning: old-style declaration or incorrect type for: tv_usec
"error.c", line 4: syntax error before or at: ->
"error.c", line 4: warning: old-style declaration or incorrect type for: tv_usec
"error.c", line 4: syntax error before or at: )
"error.c", line 4: warning: old-style declaration or incorrect type for: tv_sec
"error.c", line 4: identifier redefined: result
        current : function(pointer to struct timeval {long tv_sec, long tv_usec}) returning pointer to struct timeval {long tv_sec, long tv_usec}
        previous: function(pointer to struct timeval {long tv_sec, long tv_usec}) returning pointer to struct timeval {long tv_sec, long tv_usec} : "error.c", line 4
"error.c", line 4: syntax error before or at: ->
"error.c", line 4: warning: old-style declaration or incorrect type for: tv_sec
cc: acomp failed for error.c

The C++ compiler produces fewer errors:

$ CC error.c
"error.c", line 4: Error: No direct declarator preceding "(".
1 Error(s) detected.

Of course, the problem is that timercmp is a macro defined in sys/time.h. This is revealed when the preprocessed source is examined:

$ cc -P error.c
$ tail error.i

int  ( ( ( struct timeval * end ) -> tv_sec == ( struct timeval * begin ) -> tv_sec ) ? ( ( struct timeval * end ) -> tv_usec struct timeval * result ( struct timeval * begin ) -> tv_usec ) : ( ( struct timeval * end ) -> tv_sec struct timeval * result ( struct timeval * begin ) -> tv_sec ) )
{
  printf("TIMERCMP");
}
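The expansion above only fires because timercmp is a function-like macro and the name is immediately followed by an opening parenthesis. If you really need a function with that name, one option is to wrap the name in parentheses, which stops the preprocessor from treating it as a macro call. This is a sketch rather than a recommendation. The body is invented for illustration (the original just prints a string), and it compiles the same way on a system where timercmp is not a macro.

```c
#include <stdio.h>
#include <sys/time.h>

/* Parenthesising the name suppresses function-like macro expansion, because
   the name is no longer directly followed by '('.
   Hypothetical body: stores end - begin in *result and returns -1, 0 or 1
   according to the sign of the difference. */
int (timercmp)(struct timeval *end, struct timeval *begin, struct timeval *result)
{
    result->tv_sec  = end->tv_sec  - begin->tv_sec;
    result->tv_usec = end->tv_usec - begin->tv_usec;
    if (result->tv_usec < 0) {
        /* Borrow one second so tv_usec stays in the range 0..999999. */
        result->tv_sec  -= 1;
        result->tv_usec += 1000000;
    }
    if (result->tv_sec < 0)
        return -1;
    return (result->tv_sec > 0 || result->tv_usec > 0) ? 1 : 0;
}
```

Callers have to write (timercmp)(&end, &begin, &result) as well, otherwise the macro expands at the call site instead. That is a good argument for just picking a different name.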

Now, we can narrow the problem down more rapidly by trying to compile the preprocessed code. This takes us to the exact line with the problem, and it's obvious from inspection exactly what is going on:

$ cc error.i
"error.i", line 1135: syntax error before or at: struct
"error.i", line 1135: syntax error before or at: )
About

Darryl Gove is a senior engineer in the Solaris Studio team, working on optimising applications and benchmarks for current and future processors. He is also the author of the books:
Multicore Application Programming
Solaris Application Programming
The Developer's Edge