Stop lights and cache misses
By hoffie on Nov 14, 2005
For my value add, I'd like to offer a car analogy to help demystify the breakthrough performance and efficiency of the T1. I'm no chip guy, just a software geek, but I think I've got the big-picture concepts.
In this analogy, a car traveling over a distance is analogous to a computer getting work done. In contemporary chips like the Xeon, engine speed has been optimized to create a really fast car: the Xeon car is always rev'd up to 3,000 rpm (3,000 megahertz, or 3 GHz). As in the real world, it takes significant juice to stay rev'd at 3,000 rpm; gas or watts, it's much the same. Like a car engine, transistors have an efficiency curve: at lower clock speeds there is less current leakage, just as there is a range of rpm where fuel is optimally converted to horsepower. The T1 engine sits in a sweet spot at 1,200 megahertz (1.2 GHz).
Now you might be thinking, "This sounds boring. Is the T1 just a throttled engine that can't possibly compete with the race-engine Xeon? Where is the fun in that?"
Here's the kicker where the T1 changes the game. Just as there are stop lights in the world of automobiles, so too in the CPU world. Those stop lights are called cache misses, where the CPU has to wait for data to be fetched from RAM. The Xeon sits at the stop light (a cache miss) rev'd up at 3,000 rpm, waiting to jackrabbit off the line as soon as that data arrives (green light). The T1 plays by different rules. Whenever it comes to a stop light, it sees not one signal, but an array of 32. As long as 1 of the 32 stop lights is green, the T1 keeps on moving. And remember, in this analogy movement equals getting work done. Because the T1 has 8 cores with 4 threads each, it always has 32 independent jobs looking for data. Therefore it rarely, if ever, encounters 32 red lights causing a full stop. There are typically plenty of green lights greeting it at every intersection.
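To put a rough number on the stop-light idea, here is a toy back-of-the-envelope model (my own illustration, not from Sun's whitepaper, with made-up cycle counts): each job alternates between a short burst of compute and a longer stall waiting on memory, and a core with several hardware threads can switch to any job that isn't stalled.

```python
# Toy latency-hiding model. The cycle counts below are invented for
# illustration; real compute/stall ratios vary by workload.
COMPUTE = 3   # cycles of useful work per burst (assumed)
STALL = 9     # cycles stalled on a cache miss (assumed)

def utilization(threads):
    """Fraction of cycles the core does useful work, assuming it can
    switch to any thread that is not stalled on memory.

    Each thread wants COMPUTE cycles out of every COMPUTE + STALL,
    so n threads together demand n * COMPUTE, capped at 100%."""
    return min(1.0, threads * COMPUTE / (COMPUTE + STALL))

print(f"1 thread:  {utilization(1):.0%}")   # a single job idling at red lights
print(f"4 threads: {utilization(4):.0%}")   # one T1-style core
```

With these assumed numbers, a single-threaded core spends 75% of its cycles parked at red lights, while four threads per core are enough to keep the pipeline fully busy, and the T1 has eight such cores.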
For the official explanation of how the T1's 32 threads best a rev'd up Xeon, see the Throughput Whitepaper.