By rchrd on Jan 08, 2009
So I've got a code, and I've already compiled it without any real options, so I know it will compile. Where do I start in trying to get the best performance?
Well, the Sun Studio compilers have many options for performance optimization. You can try them all one by one and see what works.
Or, you can start off by compiling with -fast.
-fast is a macro -- it's a set of options that are all invoked simultaneously. Some of the options that it uses can be problematic for some codes. Also, compiling with -fast may increase compile time. But for most codes the resulting executable should run faster than one compiled with default options.
Also, the set of options that make up -fast differs from compiler to compiler, and depends on whether you're compiling on a SPARC or an x86/x64 processor.
One way to see the component options of -fast is to use the compiler's -dryrun or -# option.
For example, on a SPARC Solaris system:
edgard:/home/rchrd<42>f95 -dryrun -fast | grep ###
### command line files and options (expanded):
### -dryrun -xO5 -xarch=sparcvis2 -xcache=64/32/4:1024/64/4 -xchip=ultra3i -xpad=local -xvector=lib -dalign -fsimple=2 -fns=yes -ftrap=common -xlibmil -xlibmopt -fround=nearest
edgard:/home/rchrd<43>CC -dryrun -fast | grep ###
On my AMD64 OpenSolaris laptop we see:
FerrariOS:/export/home/rchrd<25>CC -dryrun -fast | grep ###
FerrariOS:/export/home/rchrd<22>cc -fast -# no.c |& grep ###
The particular options are chosen to get the best performance on the host platform ... so this assumes that you're going to run the executable binary on the same processor that compiled it.
I have one computationally intensive Fortran 95 program that runs on an UltraSPARC IIIi system in 54.4 seconds using just default compiler options. Just adding -fast to the compile command line gives me an executable that runs in only 12.2 seconds, less than a quarter of the time. The same program on my AMD64 laptop runs in about one-third the time with -fast compared to without it.
But you do have to be careful. Check the manuals, which caution:
Because -fast invokes -dalign, -fns, -fsimple=2, programs compiled with -fast can result in nonstandard floating-point arithmetic, nonstandard alignment of data, and nonstandard ordering of expression evaluation. These selections might not be appropriate for most programs.
Looks like we may have some more explaining to do.