Optimization Shortcut with -fast

Say I've got a code that I've already compiled without any special options, so I know it compiles. Where do I start if I want the best performance?

Well, the Sun Studio compilers have many options for performance optimization. You can try them all one by one and see what works. 

Or, you can start off by compiling with -fast.

-fast is a macro -- it's a set of options that are all invoked simultaneously. Some of the options that it uses can be problematic for some codes. Also, compiling with -fast may increase compile time. But for most codes the resulting executable should run faster than one compiled with default options.

Also, the set of options that make up -fast is different for each compiler, and it depends on whether you're compiling on a SPARC or x86/x64 processor.

One way to see what the component options of -fast are is to use the compiler's -dryrun or -# options.

For example, on a SPARC Solaris system:

edgard:/home/rchrd<42>f95 -dryrun -fast | grep ###
###     command line files and options (expanded):
### -dryrun -xO5 -xarch=sparcvis2 -xcache=64/32/4:1024/64/4 -xchip=ultra3i -xpad=local -xvector=lib -dalign -fsimple=2 -fns=yes -ftrap=common -xlibmil -xlibmopt -fround=nearest

edgard:/home/rchrd<43>CC -dryrun -fast | grep ###
###     command line files and options (expanded):
### -dryrun -xO5 -xarch=sparcvis2 -xcache=64/32/4:1024/64/4 -xchip=ultra3i -xmemalign=8s -fsimple=2 -fns=yes -ftrap=%none -xlibmil -xlibmopt -xbuiltin=%all -D__MATHERR_ERRNO_DONTCARE


On my AMD64 OpenSolaris laptop we see:

FerrariOS:/export/home/rchrd<25>CC -dryrun -fast | grep ###
###     command line files and options (expanded):
### -dryrun -xO5 -xarch=sse3a -xcache=64/64/2:1024/64/16 -xchip=opteron -xdepend=yes -fsimple=2 -fns=yes -ftrap=%none -xlibmil -xlibmopt -xbuiltin=%all -D__MATHERR_ERRNO_DONTCARE -nofstore -xregs=frameptr -Qoption CC -iropt -Qoption CC -xcallee64

FerrariOS:/export/home/rchrd<22>cc -fast -# no.c |& grep ###
###     command line files and options (expanded):
### -D__MATHERR_ERRNO_DONTCARE -fns -nofstore -fsimple=2 -fsingle -xalias_level=basic -xarch=sse3a -xbuiltin=%all -xcache=64/64/2:1024/64/16 -xchip=opteron -xdepend -xlibmil -xlibmopt -xO5 -xregs=frameptr no.c
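
The expanded list is easier to compare across compilers with one option per line. Here is a minimal sketch, assuming a Bourne-style shell (sh, bash, ksh) rather than the csh used in the transcripts above. The name expand_opts is a hypothetical helper, not part of the compilers. In these shells the ### pattern is quoted, because an unquoted # can start a comment, and 2>&1 takes the place of csh's |&.

```shell
# Hypothetical helper: read -dryrun / -# output on stdin and print
# each expanded option on its own line.
expand_opts() {
    grep '^### -' |      # keep only the line that lists the options
    sed 's/^### //' |    # drop the leading "### " marker
    tr -s ' ' '\n'       # one option per line
}

# Usage, assuming the Sun Studio cc driver is on PATH and no.c exists:
#   cc -fast -# no.c 2>&1 | expand_opts
```

Options that take a separate argument, such as -Qoption CC -iropt, end up split over several lines, so treat the output as a reading aid only.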



The particular options are chosen to get the best performance on the host platform ... so this assumes that you're going to run the executable binary on the same processor that compiled it.
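
If the executable has to run on machines other than the build host, the host-specific part of -fast can be overridden. This is a sketch, assuming the documented rule that an option to the right of -fast takes precedence over the value the macro sets, and assuming -xtarget=generic is what you want for the deployment machines. The file names prog and prog.f95 are hypothetical. The snippet only prints the command so you can inspect it first.

```shell
# Assumption: the binary will run on processors other than the build host.
FC=${FC:-f95}                     # Sun Studio Fortran driver by default
FFLAGS="-fast -xtarget=generic"   # keep -fast, replace host-specific tuning
CMD="$FC $FFLAGS -o prog prog.f95"

# Print the command line; run it with: eval "$CMD"
echo "$CMD"
```

Adding -dryrun to the printed command should show whether the -xarch, -xcache and -xchip values changed from the host-specific ones.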

I have one computationally intensive Fortran 95 program that runs on an UltraSPARC IIIi system in 54.4 seconds using just default compiler options. Just adding -fast to the compile command line gives me an executable that runs in only 12.2 seconds .. a little over one-fifth the time. The same program on my AMD64 laptop runs in about one-third the time with -fast.
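
To put a number on the SPARC result, here is a small sketch that turns two run times into a speedup factor. The function name speedup is hypothetical. The two inputs are the timings quoted above.

```shell
# Hypothetical helper: speedup BASE_SECONDS FAST_SECONDS
speedup() {
    awk -v base="$1" -v fast="$2" 'BEGIN {
        printf "speedup %.2fx, %.0f%% of the original run time\n",
               base / fast, 100 * fast / base
    }'
}

speedup 54.4 12.2
# prints "speedup 4.46x, 22% of the original run time"
```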

But you do have to be careful. Check the manuals, which caution:

Because -fast invokes -dalign, -fns, -fsimple=2, programs compiled with -fast can result in nonstandard floating-point arithmetic, nonstandard alignment of data, and nonstandard ordering of expression evaluation. These selections might not be appropriate for most programs.
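
Why does the ordering of expression evaluation matter? Floating-point addition is not associative, so an optimizer that is allowed to regroup terms can change a result. The sketch below uses awk, which computes in double precision, purely to illustrate that arithmetic fact. It does not invoke the compilers, it assumes IEEE 754 doubles, and the name reassoc_demo is hypothetical.

```shell
# Same three values, two groupings, two different answers.
reassoc_demo() {
    awk 'BEGIN {
        a = 1e16; b = -1e16; c = 1
        # (a + b) + c: the large terms cancel first, so c survives.
        # a + (b + c): c is lost when added to b, so the sum is 0.
        print (a + b) + c, a + (b + c)
    }'
}

reassoc_demo
# prints "1 0"
```

Whether a given program is sensitive to this kind of regrouping depends on the code, which is why the results of a -fast build are worth checking against a default build.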

Looks like we may have some more explaining to do.

About


Deep thoughts on compiling C, C++, and Fortran codes with Oracle Solaris Studio compilers, especially optimization and parallelization, from the Solaris Studio documentation lead, Richard Friedman. Email him at
Richard dot Friedman at Oracle dot com