Wednesday Dec 15, 2010

Oracle Solaris Studio C/C++ : Inline Functions

Function inlining improves runtime performance by replacing a call to a function with the body of the function. This eliminates the overhead of jumping to and returning from a subroutine. An additional advantage is that placing the function code "inline" exposes it to further optimization.

The C++ compiler performs two kinds of inlining: front-end (parser) and back-end (code generator). The C and Fortran compilers support only back-end inlining. The same code generator is used for all compilers on a platform. The C++ compiler performs front-end inlining because it can use its knowledge of C++ semantics to eliminate extra copies of objects among other things that the code generator would not be able to do. The back-end inlining does not depend on the programming language.

The C++ compiler front end attempts to inline a function declared inline. If the function is too large, a warning message will be printed on stdout when +w or +w2 ("more warnings") option is used. The +d option prevents the front end from attempting to inline any function. The -g option also turns off front-end inlining. The -O options do not affect front-end inlining. C++ front end turns off function inlining when a combination of -g and -O options are specified on compile line. It may lead to loss of runtime performance. To avoid this, it is suggested to use -g0 instead of -g. C does not have -g0 so use -g instead.

With an optimization level of -O4 or higher, the code generator examines all functions independent of how they were declared in source code and replace function calls with inline code where it thinks the replacement will be beneficial. No diagnostic messages are displayed about back-end inlining. The +d option has no impact on back-end inlining.

Couple of trivial examples to demonstrate the compiler behavior.

eg.,

% cat inline.c

#include <stdio.h>

inline void printmespam() {
        printf("print me"); printf("print me"); 
        printf("print me"); printf("print me");
        printf("print me"); printf("print me"); 
        printf("print me"); printf("print me");
        printf("print me"); printf("print me"); 
        printf("print me"); printf("print me");
        printf("print me"); printf("print me"); 
        printf("print me"); printf("print me");
        printf("print me");
}

inline void printme() {
        printf("print me");
}

int main() {
        printme();
        printmespam();
        return (0);
}

% CC +w2 inline.c
"inline.c", line 17: Warning: "printmespam()" is too large and will not be expanded inline.
1 Warning(s) detected.

In the above example, printmespam() was not inlined by the compiler though it was explicitly requested to do so. The keyword inline is only a request but not a guarantee.

How to check if a routine is inlined?

A: Check the symbol table of the executable. If the routine doesn't show up in the symbol table, it is an indication that the missing routine is inlined. This is because the compiler might have replaced the function call with the body of the function.

% elfdump -CsN.symtab a.out | grep printme
      [85]  0x00010e68 0x000000a4  FUNC GLOB  D    0 .text       void printmespam()

printme is inlined where as printmespam is not.

Another way is to check the assembly code being generated. To generate the assembly code, compile the code with -S option of Oracle Solaris Studio compilers.

eg.,

% cat swap.c

void swap (int \*a, int \*b) {
        int t = \*a;
        \*a = \*b;
        \*b = t;
}

int main (int argc, char \*argv) {
        int x = 5, y = 2;
        swap (&x,&y);
        return (0);
}

% CC +w2 -S swap.c

% grep call swap.s
        call    __1cEswap6Fpi0_v_

% dem __1cEswap6Fpi0_v_
__1cEswap6Fpi0_v_ == void swap(int\*,int\*)

From the above output(s), it is clear that the function is not inlined since an assembly instruction has been generated with a call to routine swap. Let's add the keyword inline to the function definition.

% cat swap.c

inline void swap (int \*a, int \*b) {
        int t = \*a;
        \*a = \*b;
        \*b = t;
}

int main (..) { .. }

% CC +w2 -S swap.c
% grep call swap.s
%

After instructing the compiler to inline the routine swap, the compiler was able to inline the function in main() mainly because it was not too big. That is why no assembly instruction has been generated with a call to swap.

Another example to demonstrate slightly different behavior.

eg.,

% cat inline2.c

#include <stdio.h>

int globvar = 0;

inline void setglob () {
        globvar= 25;
}

int main (int argc, char \*argv[]) {
        globvar= 5;
        setglob ();
        printf ("Now global variable holds %d\\n", globvar);
        return (0);
}

% cc -o test inline2.c
Undefined                       first referenced
 symbol                             in file
setglob                             inline2.o
ld: fatal: Symbol referencing errors. No output written to test

The above code violates a C rule. An inline definition without an extern directive does not create an instance of the function. Calling the function has undefined results. The fix is to declare setglob with external linkage as shown below.

C++ has a different rule for inline functions. The compiler is required to figure out how to generate a defining instance if one is needed without any special action by the programmer. So the above example has valid C++ code but it is valid in C.

 
% cat inline2.c

#include <stdio.h>

int globvar = 0;

extern inline void setglob () {
        globvar= 25;
}

int main (..) { .. }

% cc inline2.c
% ./a.out
Now global variable holds 25

Notes:

  1. Do not use if(0) in an inline function. Use #if 0 instead

  2. Do not put a return statement in the "then" part of an "if" statement. Rearrange the code to put the return in the "else" part or outside the if-else entirely.

Acknowledgements:
Steve Clamage

(Location of original blogpost:
http://technopark02.blogspot.com/2005/04/sun-cc-compilers-inlining-routines.html)

Sunday Apr 15, 2007

C/C++: Printing Stack Trace with printstack() on Solaris

On Solaris 9 and later, libc provides an useful function called printstack, to print a symbolic stack trace to the specified file descriptor. This is useful for reporting errors from an application during run-time.

If the stack trace appears corrupted, or if the stack cannot be read, printstack() returns -1.

Here is a programmatic example:

% more printstack.c

#include <stdio.h>
#include <ucontext.h>

int callee (int file)
{
    printstack(file);
    return (0);
}

int caller()
{
    int a;
    a = callee (fileno(stdout));
    return (a);
}

int main()
{
    caller();
    return (0);
}

% cc -V
cc: Sun C 5.8 Patch 121016-04 2006/10/18 <- Sun Studio 12 C compiler
usage: cc [ options] files.  Use 'cc -flags' for details

% cc -o stacktrace stacktrace.c

% ./stacktrace
/tmp/stacktrace:callee+0x18
/tmp/stacktrace:caller+0x22
/tmp/stacktrace:main+0x14
/tmp/stacktrace:0x6d2

The printstack() function uses dladdr1() to obtain symbolic symbol names. As a result, only global symbols are reported as symbol names by printstack().

The stack trace from a C++ program, will have all the symbols in their mangled form. So as of now, the programmers may need to have their own wrapper functions to print the stack trace in demangled form.

% CC -V
CC: Sun C++ 5.8 Patch 121018-07 2006/11/01 <- Sun Studio 12 C++ compiler

% CC -o stacktrace stacktrace.c

% ./stacktrace
/tmp/stacktrace:__1cGcallee6Fi_i_+0x18
/tmp/stacktrace:__1cGcaller6F_i_+0x22
/tmp/stacktrace:main+0x14
/tmp/stacktrace:0x91a

There has been an RFE (Request For Enhancement), 6182270 printstack should provide demangled names while using with c++, in place against Solaris' libc to print the stack trace in demangled form, when printstack() has been called from a C++ program. This will be released as a libc patch for Solaris 9 & 10, as soon as it gets enough attention.

% /usr/ccs/bin/elfdump libc.so.1  | grep printstack
     [677]  0x00069f70 0x0000004c  FUNC WEAK  D    9 .text          printstack
    [1425]  0x00069f70 0x0000004c  FUNC GLOB  D   36 .text          _printstack
     ...

Since the object code is automatically linked with libc during the creation of an executable or a dynamic library, the programmer need not specify -lc on the compile line.

Suggested Reading:
Man page of walkcontext / printstack

(Original blog post is at:
http://technopark02.blogspot.com/2005/05/cc-printing-stack-trace-with.html)

About

Benchmark announcements, HOW-TOs, Tips and Troubleshooting

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today