I recently read a couple of posts about SPEC CPU2006. As you can tell from the papers linked on this blog, I was quite busy helping prepare the suite - which was considerable fun. The first post is by Tom Yager, where he praises the suite for raising awareness of the components of a system that actually contribute to performance: "I added a practical angle to my scientific understanding of compiler optimizations, processor scheduling, CPU cache utilization".
On the other hand, Neil Gunther (second time I've mentioned him) condemns "bogus SPECxx_rate benchmarks which simply run multiple instances of a single-threaded benchmark". I hope he's joking, but taking his comments at face value...
Interestingly he suggests SPEC SDM as a good choice. I'd not heard of this suite, but reading up on it, it looks like it tests the impact of multiple users typing and executing commands on the system at the same time, and it's not been touched in over 10 years. I guess SDM would be a good match for the SunRay that I use daily, but I'm certain that the suite doesn't include the 22 copies of firefox that I currently see running on the server I'm using. On only slightly less rocky ground he talks about TPC-C, which appeared in 1992!
CPU2006 represents the CPU-intensive portion very well, but deliberately avoids hitting the disk or network. Since disk and network do play significant roles in system performance, I probably wouldn't recommend choosing a machine purely on its SPECcpu2006 or SPECcpu2006_rate scores. However, the mix of apps in the benchmark suite is representative of most of the codes that are out there (and I believe some are codes that appeared less than 10 years ago). So whatever app is being run on a system, there is probably a code in CPU2006 which is not that dissimilar.
Tackling his core beef with the suite, that the rate metric "simply" runs multiple copies of the same code: this is actually harder work for the system than running a heterogeneous mix of applications, because every copy contends for the same shared resources at the same time. So I'd suggest that it is a better test of system capability than running some codes that stress memory bandwidth together with some other codes that are resident in the L1 cache. So IMO, far from being "bogus", specrate is a very good indicator of potential system throughput.
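To make the point concrete, here's a toy sketch (nothing to do with SPEC's actual run harness, and the `workload` function is a stand-in I made up) of what a rate-style run does: launch N identical copies of one workload at the same time and measure throughput as copies completed per unit time. Because every copy executes the same code, they all hit the same shared resources (memory bandwidth, caches) simultaneously, which is exactly the contention the rate metric exposes.

```python
# Illustrative sketch only: run N identical copies of a workload
# concurrently, the way a "rate" run does, and report throughput.
# The workload itself is a made-up stand-in, not a SPEC benchmark.
import multiprocessing as mp
import time

def workload(_):
    # A simple memory-touching loop standing in for one benchmark copy.
    data = list(range(200_000))
    total = 0
    for x in data:
        total += x
    return total

def rate_run(copies):
    """Run `copies` identical instances at once; return (results, elapsed)."""
    start = time.perf_counter()
    with mp.Pool(processes=copies) as pool:
        results = pool.map(workload, range(copies))
    elapsed = time.perf_counter() - start
    return results, elapsed

if __name__ == "__main__":
    results, elapsed = rate_run(4)
    # Every copy runs identical code, so the contention is for shared
    # hardware resources, not for different code paths.
    throughput = len(results) / elapsed  # copies completed per second
    print(f"{len(results)} copies in {elapsed:.2f}s -> {throughput:.2f} copies/s")
```

A heterogeneous mix, by contrast, lets a cache-resident code hide behind a bandwidth-hungry one; identical copies offer no such shelter.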