By walterbays on Aug 23, 2006
Performance engineers are sometimes accused of being obsessed with performance, to the exclusion of common sense. We tweak and tune systems for the maximum possible throughput, and frankly some ordinary users are left behind - people who don't really care how fast it is so long as it's fast enough - people who have their own jobs to do, and for whom the computer is just another tool.
Verlag Heinz Heise publishing (iX Magazin, Germany) has long offered constructive criticism to SPEC, like this article on the release of the CPU2000 benchmark suite. Not ones merely to heckle the teams from the sidelines, iX runs their own benchmark tests, publishing the results in their magazine, on their web site, and also at SPEC. This lets them show performance they consider real world: tuned according to good application development practice, but short of what they consider heroic benchmark tuning. For example, in 2001 they published this result of 88.3 SPECint_rate_base2000 on our 24-chip 750 MHz Sun Fire 6800. We tuned the same system to achieve 96.1 SPECint_rate_base2000 and 101 SPECint_rate2000 (peak).
So we could show the user how to get 14% more performance from the system. However, today you can get 78.8 SPECint_rate_base2000 and 89.8 SPECint_rate2000 from a 4-chip 3 GHz Sun Fire V40z. So who can blame the user if, instead of investing much effort in performance tuning, he would rather just ride Moore's Law for a while?
It's relatively easy for an independent party like Verlag Heinz Heise to say what they mean by real world optimization; it is far harder for the members of SPEC to agree on what we mean by peak versus baseline optimization. As we described the baseline metric, it "...uses performance compiler flags that a compiler vendor would suggest for a given program knowing only its own language." (See CPU2000 FAQ.) But of course the degree of optimization is a continuum: there will be application developers who do not optimize even as much as SPEC base, and there will be those who optimize more than SPEC peak.
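To make the base/peak distinction concrete, here is a minimal sketch using common gcc flags as stand-ins. The flag choices and benchmark names are illustrative assumptions of mine, not an official SPEC configuration.

```shell
# Illustrative only: ordinary gcc flags standing in for a SPEC config file.

# Base: one flag set, applied uniformly to every benchmark in the language.
BASE_CFLAGS="-O2"

# Peak: flags may be chosen per benchmark after experimentation.
PEAK_CFLAGS_gzip="-O3 -funroll-loops"
PEAK_CFLAGS_mcf="-O2 -fno-strict-aliasing"

echo "base compile: cc $BASE_CFLAGS -o gzip gzip.c"
echo "peak compile: cc $PEAK_CFLAGS_gzip -o gzip gzip.c"
```

The point is the discipline, not the particular flags: base forbids the per-program flag shopping that peak permits.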
One benefit of SPEC's full disclosure reports is that they can serve as examples of good tuning practice to supplement system documentation. SPEC benchmarks may not use tuning that is not documented, supported, and recommended for general use. So a visit to the flags disclosure pages may provide a wealth of good tuning ideas for your own applications.
Tuning for system-level benchmarks may be even more complex than choosing compiler options for the CPU benchmarks. For example, in this SPECjAppServer2004 result there are tuning notes for the emulator software, the database software, the driver software, and the operating systems, as well as for the J2EE application server:
-XX:+UseParallelGC -XX:ParallelGCThreads=32 -XX:PermSize=128m
-XX:MaxTenuringThreshold=3 -XX:LargePageSizeInBytes=4m -XX:SurvivorRatio=20
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:-TraceClassUnloading
Java process started in FX class using /usr/bin/priocntl -e -c FX
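Pulled together, a launch script for such a configuration might look like the sketch below. The application jar name and the grouping of options are assumptions of mine, and priocntl is Solaris-specific.

```shell
# Sketch of assembling the disclosed options into one launch command.
# appserver.jar is a placeholder name; priocntl is Solaris-specific.
GC_OPTS="-XX:+UseParallelGC -XX:ParallelGCThreads=32 -XX:PermSize=128m"
HEAP_OPTS="-XX:MaxTenuringThreshold=3 -XX:LargePageSizeInBytes=4m -XX:SurvivorRatio=20"
LOG_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"

# priocntl -e executes the command in the FX (fixed-priority) scheduling
# class, so the JVM's priority is not adjusted by the timeshare scheduler.
CMD="/usr/bin/priocntl -e -c FX java $GC_OPTS $HEAP_OPTS $LOG_OPTS -jar appserver.jar"
echo "$CMD"
```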
Again, the SPEC full disclosure reports may guide the reader in discovering and using various system tuning parameters. Of course, with system tuning you'll probably also want to take a course and/or get a good book like Solaris Internals.
For the user without the time or inclination to learn system performance tuning, who wants his system to be just fast enough while he waits for Moore's Law to bring next year's more powerful server, I'm afraid the computer industry just hasn't done enough. The trends are positive, though. Suppose you discover, for instance, that it's good for your C application to align branch targets based on execution frequency. A Java application may have such an optimization applied automatically by a Java Virtual Machine with dynamic optimization, such as HotSpot. Operating systems may, with your permission, automatically apply system updates, including performance updates, through services such as Microsoft Update (sorry, the corresponding URL on microsoft.com only works with Explorer) and Sun Update Connection.
Computer, tune thyself.