Stop lights and cache misses

In my 8 years at Sun, the anticipation and excitement surrounding the T1 chip (code-named Niagara) has been unrivaled. Even better than hearing the official information over the last few years is hearing the anecdotal stories passed from one employee to another about actual performance results being achieved on internal tests and at customer sites. Everyone is doing a good job of keeping a lid on the details, but when Jonathan says we're heading into a significant period of leadership in performance, don't say nobody warned you.

For my value add, I'd like to offer a car analogy to help demystify the breakthrough performance and efficiency of the T1. I'm no chip guy, just a software geek, but I think I've got the big-picture concepts.

In this analogy, a car traveling over a distance is equal to getting work done with a computer. In contemporary chips like the Xeon, engine speed has been optimized to create a really fast car. The Xeon car is always rev'd up to 3,000 rpm (3,000 MHz, or 3 GHz). As in the real world, it takes significant juice to stay rev'd at 3,000 rpm; gas or watts, it's much the same. Like a car engine across its rpm range, transistors have an efficiency curve. At lower clock speeds there is less current leakage, just as there is a range of rpm where fuel is optimally converted to horsepower. The T1 engine is in its sweet spot at 1,200 MHz.

Now you might be thinking, "This sounds boring. Is the T1 just a throttled engine that can't possibly compete with the race engine Xeon? Where is the fun in that?"

Here's the kicker, where the T1 changes the game. Just as there are stop lights in the world of automobiles, so too are there in the CPU world. Those stop lights are called cache misses, where the CPU has to wait for data to be fetched from RAM. The Xeon sits at the cache-miss stop light rev'd up at 3,000 rpm, waiting to jackrabbit off the line as soon as that data arrives (green light). The T1 plays by different rules. Whenever it comes to a stop light, it sees not one signal, but an array of 32. As long as 1 of the 32 stop lights is green, the T1 keeps on moving. And remember, in this analogy movement equals getting work done. Because the T1 has 8 cores with 4 threads each, it always has 32 independent jobs looking for data. Therefore it rarely, if ever, encounters 32 red lights causing a full stop. There are typically dozens of green lights greeting it at every stop light.
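For the software geeks, here is a rough way to see the stop light effect in code. This is a toy model of my own, not how the T1 actually schedules its threads: the function name `simulate_core`, the 10 cycles of work between misses, and the 30-cycle wait for RAM are all made-up numbers picked to keep the arithmetic easy. Each thread drives for a bit, hits a red light, and the core simply moves on to any thread that has a green.

```python
def simulate_core(num_threads, compute_cycles, miss_cycles, total_cycles):
    """Toy model: fraction of cycles in which a single core does useful work.

    Every thread computes for `compute_cycles`, then stalls on a cache miss
    for `miss_cycles`. The core sticks with one thread until it stalls, then
    switches to any other thread that is ready. All numbers are illustrative
    assumptions, not measurements of a real chip.
    """
    compute_left = [compute_cycles] * num_threads  # work left before next miss
    ready_at = [0] * num_threads                   # cycle when each miss resolves
    busy_cycles = 0
    current = 0

    for cycle in range(total_cycles):
        # Look for a green light, starting with the thread we were running.
        runnable = None
        for offset in range(num_threads):
            t = (current + offset) % num_threads
            if ready_at[t] <= cycle:
                runnable = t
                break
        if runnable is None:
            continue  # every light is red: the core idles this cycle

        current = runnable
        compute_left[current] -= 1
        busy_cycles += 1
        if compute_left[current] == 0:
            # This thread just hit a cache miss; it waits on RAM.
            ready_at[current] = cycle + 1 + miss_cycles
            compute_left[current] = compute_cycles

    return busy_cycles / total_cycles


# Assumed workload: 10 cycles of work, then a 30-cycle wait for memory.
for threads in (1, 2, 4):
    print(threads, simulate_core(threads, 10, 30, 4000))
# 1 0.25
# 2 0.5
# 4 1.0
```

With these assumed numbers, one thread keeps the core moving only a quarter of the time, while four threads on the same core keep it moving on every cycle. Real workloads are messier, but the shape of the result is the point of the analogy.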

For the official explanation of how the T1's 32 threads best a rev'd up Xeon, see the Throughput Whitepaper.

hoffie
