By Darryl Gove on Apr 12, 2007
The training workloads in CPU2006 are pretty good. Surprisingly, as pointed out in the paper, SPEC didn't have to change many of the workloads to make this happen. This supports the hypothesis that programs generally run through the same code paths regardless of the input.
What I like about this work is that it provides a way of assessing training workload quality. This directly addresses one of the concerns of some people about using profile feedback for optimisation: whether the selected training workload is going to be a bad choice for their actual workload.
In terms of the methodology, it's tempting to think that the best test would be to measure performance before and after. That approach was rejected because the performance gain is a function of both the quality of the training workload and the ability of the compiler to exploit that workload (together with whether there is anything the compiler could do to improve performance even with that knowledge). So a performance gain doesn't necessarily mean a good quality training workload, and the absence of a gain doesn't necessarily mean a poor quality training workload.
The final metrics of code coverage and branch behaviour are about as hardware/compiler independent as possible. It should be possible to break down any program on any hardware into very similar blocks, even if the instructions in those blocks end up being different. So the approach seemed particularly good for evaluating cross-platform benchmarks.
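To make the two metrics a little more concrete, here is a rough Python sketch of how they might be computed from per-block and per-branch counts. This is my own illustration rather than the exact definitions from the paper: the function names, the dictionary layout, and the choice to weight everything by the reference workload's execution counts are all assumptions.

```python
def code_coverage(train_counts, ref_counts):
    """Fraction of the reference run's block executions that fall in
    blocks the training run also executed (assumed definition)."""
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    covered = sum(
        count
        for block, count in ref_counts.items()
        if train_counts.get(block, 0) > 0
    )
    return covered / total


def branch_agreement(train_branches, ref_branches):
    """Fraction of the reference run's branch executions whose usual
    direction (mostly taken or mostly not taken) matches the training
    run. Each dict maps a branch id to (taken, not_taken) counts.
    Branches the training run never reached count as disagreeing
    (an assumption made for this sketch)."""
    total = 0
    agree = 0
    for branch, (ref_taken, ref_not_taken) in ref_branches.items():
        weight = ref_taken + ref_not_taken
        if weight == 0:
            continue
        total += weight
        train_taken, train_not_taken = train_branches.get(branch, (0, 0))
        if train_taken + train_not_taken == 0:
            continue
        same_direction = (train_taken >= train_not_taken) == (
            ref_taken >= ref_not_taken
        )
        if same_direction:
            agree += weight
    return agree / total if total else 0.0


# Hypothetical counts, made up purely for illustration.
train_blocks = {"a": 10, "b": 5}
ref_blocks = {"a": 60, "b": 30, "c": 10}
train_br = {"x": (9, 1), "y": (8, 2)}
ref_br = {"x": (90, 10), "y": (5, 45)}

coverage = code_coverage(train_blocks, ref_blocks)
agreement = branch_agreement(train_br, ref_br)
```

In this made-up data the training run misses block "c", which accounts for a tenth of the reference run, and branch "y" goes the opposite way in the two runs. Neither number depends on how fast the resulting binary runs, which is the point of choosing these metrics.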
For those interested in using this method, there's a pretty detailed how-to guide on the developer portal.