    June 23, 2006

Performance tuning suggestions

Guest Author

I have an article on selecting the best compiler options. However, it's probably a good idea to run quickly through some rules of thumb.

If you have machine time and are willing to let a tool select the best set of compiler options, then the tool to try is ats. It uses a set of heuristics to identify the compiler options that give the best performance (or code size, or other metrics as required).

Assuming that you are interested in getting the best performance out of an application, the following steps might be helpful. I'm not going to talk through a complete set of compiler options (for example, I omit profile feedback), but hopefully this will be sufficient to get started.

  • Compile with -O and run. This gives a good feel for baseline performance. It goes without saying (so I'll say it anyway) that you should use -g (for C/Fortran) and -g0 (for C++) to enable debug information.
  • Compile with -fast -xipo=2 -xpagesize=4M. This should give better performance. If there's not much difference, then perhaps your application is one that is not amenable to optimisation.
  • Gather profiles for both the -O and the -fast -xipo=2 -xpagesize=4M builds, and check where the hotspots are.
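
The three steps above might be scripted roughly as follows. This is only a sketch, not a fixed procedure: the source file app.c, the binary and experiment names, and the DRY_RUN switch are hypothetical, and the collect and er_print lines assume the Sun Studio Performance Analyzer tools are on the path. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: build a baseline and a -fast binary, then profile both.
# app.c, the app_* names and DRY_RUN are hypothetical placeholders.

SRC=${SRC:-app.c}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

build_and_profile() {
    # $1 = tag used in file names, remaining arguments = compiler flags
    tag=$1; shift
    run cc -g "$@" -o "app_$tag" "$SRC"      # -g keeps debug information
    run collect -o "$tag.er" "./app_$tag"    # record a profile experiment
    run er_print -functions "$tag.er"        # list time by function
}

build_and_profile base -O
build_and_profile fast -fast -xipo=2 -xpagesize=4M
```

Comparing the two function lists side by side shows whether the hot functions are the same in both builds, and how much each one gained from the more aggressive options. For C++ the compiler would be CC and the debug flag -g0, as noted above.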

Assuming that there is a difference in performance between the two sets of options, examining the performance of the following builds will indicate which options produce those gains. It is best to use only the flags that you have identified as being important for performance, and it is equally important that you know that each of the flags is safe for your application.

Flags                                          Comment
-xO5                                           Get a baseline at a higher optimisation level
-xO5 -xpagesize=4M                             Check whether the issue is TLB misses
-xO5 -xdepend                                  Enable dependency analysis, which should lead to better loop scheduling
-xO5 -xdepend -fsimple=2 -fns -xlibmil -lmopt  Enable floating point optimisation (only use if the program does not contain code crafted to adhere to the floating point maths specification IEEE-754)
-xO5 -xipo=2                                   Determine whether the code is sensitive to inlining
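
One way to work through the flag sets in the table is to build one binary per row and time each on the same workload. The sketch below assumes a single hypothetical source file app.c and hypothetical output names app_1 to app_5; DRY_RUN is likewise an illustrative switch, and by default the script only prints the compile commands.

```shell
#!/bin/sh
# Sketch: one build per row of the flag table, each timed on the same run.
# app.c, the app_N names and DRY_RUN are hypothetical placeholders.

SRC=${SRC:-app.c}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print commands, 0 = execute them

flag_sets() {
    # One flag set per line, copied from the table.
    cat <<'EOF'
-xO5
-xO5 -xpagesize=4M
-xO5 -xdepend
-xO5 -xdepend -fsimple=2 -fns -xlibmil -lmopt
-xO5 -xipo=2
EOF
}

build_all() {
    n=0
    flag_sets | while read -r flags; do
        n=$((n + 1))
        # Flags follow the source file so that a library option such as
        # -lmopt comes after the code that needs it.
        cmd="cc -g -o app_$n $SRC $flags"
        echo "$cmd"
        if [ "$DRY_RUN" = "0" ]; then
            # $cmd is left unquoted on purpose so it splits into words.
            $cmd && time "./app_$n"
        fi
    done
}

build_all
```

Each row differs from the -xO5 baseline by one group of options, so a change in run time can be attributed to that group. Only the groups that help, and that are known to be safe for the application, need to go into the final build.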

The strongest recommendation is to always profile your code. spot is one way of doing this. It's helpful to look at spot output in the light of some of the suggestions here on using the performance counters to guide performance tuning.
