Don't Become Moore Confused (Or, The Death of the Microprocessor, not Moore's Law)

It was great to see that Gordon Moore got to deliver his “40 years later” talk at the Computer History Museum. I hope, though I know in vain, that at last everyone now understands what Moore's Law actually predicts --- and more importantly, what it doesn't. It is a prediction about the doubling of the number of transistors on an integrated circuit, about every 24 months.

It isn't a prediction about the speed of computers.

It isn't a prediction about their architecture.

It isn't a prediction about their size.

It isn't a prediction about their cost.

It is a prediction about the number of transistors on a chip. Full stop. That's it.

Let's take this one at a time. But, first, a little math for the exponentially challenged. In 40 years there are 20 24-month periods. 2^20 is about one million. A bit of revisionism calls the doubling time 18 months. In that scenario, there are about 26.7 doubling times, or a factor of about 100 million. Let's just split the difference (logarithmically) and say that we've got about 10 million transistors on a chip today for every one we had 40 years ago. In any case, the biggest chips we build today are about 500 million transistors.
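For the truly exponentially challenged, the back-of-the-envelope arithmetic above can be checked in a few lines (a sketch; the 40-year span and the 24- and 18-month doubling periods are the ones from the paragraph):

```python
import math

months = 40 * 12  # 40 years

# Doublings at the canonical 24-month period: 20 doublings, ~1 million x
factor_24 = 2 ** (months / 24)

# Doublings at the revisionist 18-month period: ~26.7 doublings, ~100 million x
factor_18 = 2 ** (months / 18)

# "Splitting the difference logarithmically" is the geometric mean: ~10 million x
split = math.sqrt(factor_24 * factor_18)

print(f"{factor_24:.3g} {factor_18:.3g} {split:.3g}")
```

Note that averaging the two factors arithmetically would give a misleading ~50 million; the geometric mean is what "splitting the difference" means for exponentials.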

Okay, what about speed? A fundamental misconception is that Moore's Law predicts that computer speed will double every 18 to 24 months. Worse, since a very large West coast semiconductor company decided to market the equation that clock speed = performance, I can't tell you how many times I have had the question “With all of the power and heat problems with microprocessors, it looks like clock rates have maxed out or (gasp) have actually slowed down. Are we seeing the end of Moore's Law?” I used to scream. Now I just sigh.

No. Gordon said nuttin 'bout clock rates. And a little data shows how ridiculous that would be. The IBM System/360 Model 91 (vintage 1967) had a clock rate of about 10 MHz. Ten million times that would mean that today's microprocessors should clock at 100 THz, or about 10,000 times faster than the fastest-clocked chips today.
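The reductio above is worth making explicit (a sketch; the 10 MHz baseline and the ~10-million factor come from the text, and ~10 GHz is a deliberately generous ceiling for real clock rates):

```python
# If Moore's Law were about clock rate, the ~10 MHz 360/91 of 1967
# scaled by the ~10-million transistor factor would imply 100 THz today.
base_hz = 10e6          # System/360 Model 91 clock
moore_factor = 10e6     # ~10 million, from the doubling arithmetic
implied_hz = base_hz * moore_factor   # 1e14 Hz = 100 THz

generous_real_hz = 10e9  # ~10 GHz, more than any shipping chip clocks
print(implied_hz / generous_real_hz)  # the absurdity gap: ~10,000x
```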

Size? Well, this is fuzzy enough to say “yes and no.” At the computer level, the answer is firmly “no.” An average industrial refrigerator-sized computer of the late sixties was under 10 cubic meters. Today's 1-2RU rack-mounted server is in the neighborhood of 0.01 cubic meters. That's only a factor of 1,000. Now at the processor level, it depends upon what kind of packaging you consider. Looking at bare dice, you can get close to a factor of 10 million, but this kind of analysis is really more about the number of transistors on a chip --- which is Moore's Law.

Cost? Certainly not. A usable server is about $1000 today. Even with generous inflation adjustment, this still translates to a $1B price tag in 1970, which is ridiculous. Before you fire off a flame to me about $1.50 microcontrollers and five-cent RFID tags, I'll point out that there were plenty of low-cost computers in the late sixties and early seventies. Think of PDP-8s. And remember the first calculators in the mid-seventies? HP and TI had low-cost, programmable, battery-powered, (coat)pocket-sized offerings for only a few hundred dollars.

Moore's Law is about transistors. I can print a chip today that has almost a billion transistors on it. Let's look at that more closely. Our first version of SPARC was constructed from an ASIC of about 100,000 transistors. So today we could fit TEN THOUSAND of our original SPARC microprocessors on a single chip. That, gentle readers, is interesting.
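The division is trivial, but it's the number that carries the whole argument (a sketch using the two figures from the paragraph):

```python
# How many original SPARCs fit on a ~1-billion-transistor die?
transistors_per_sparc = 100_000        # first SPARC ASIC, per the text
transistors_per_die = 1_000_000_000    # "almost a billion"

print(transistors_per_die // transistors_per_sparc)  # 10,000 SPARCs per die
```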

It's interesting because, today at least, we don't put 10,000 processors on a single chip. What we did do with the gift of transistor doubling was to continually reinvest it in building bigger and badder (and more power-hungry) single processors. Not a lot of architectural innovation, I might add. We basically took many of the ideas pioneered in the sixties (mostly invented at IBM; see the ACS, Stretch, and 360/91 and compare them with “modern” microprocessors) and re-implemented them on single pieces of silicon.

We took discrete processors and integrated them into microprocessors. The serious displacement of discrete systems (bipolar TTL and ECL) started in 1985 and by 1990 it was all over. The discrete processor-based systems, from minicomputers to mainframes to supercomputers, all died, being replaced by microprocessor-based designs.

Now here we are 20 years later. We have squeezed all of the blood from that stone. We're done. Actually, we over-did it. Continuing to throw transistors at making single processors run faster is a bad idea. It's kinda like building bigger and bigger SUVs in order to solve our transportation problems. As I said, a bad idea.

A direct consequence of pursuing this bad idea is that, like gigantic SUVs, the energy efficiency of our biggest and fastest microprocessors is horrible. Meaning, we get very poor computing return for every watt we invest. Outside of portable applications, this extreme energy wasting has really only become a concern when the industry realized that it was getting very difficult to remove the waste heat --- to cool the engine, as it were.

(Another consequence is that these complex microprocessors are, well, complex. That means more engineers to create the design, more engineers to test that the design is correct, and whole new layers of managers to try to coordinate the resulting hundreds and hundreds of folks on the project. Bugs increase, schedules are missed, and innovation actually decreases.)

The result: microprocessors are dead.

Just as the '80s discrete processors were killed by microprocessors, today's discrete systems --- motherboards full of supporting chip sets and PCI slots with sockets for microprocessors --- will be killed by microsystems: my word for the just-starting revolution of server-on-a-chip. What's that? Pretty much what it sounds like. Almost the entire server (sans DRAM) is reduced to a single chip (or a small number of co-designed ones, just as the first micros often had an outboard MMU and/or FPU). These microsystems directly connect to DRAM and to very high-speed serialized I/O links that are converted to either packet or coherent-memory style network connections.

Open up the lid of a microsystem and you'll find a full SMP: multiple processor cores, crossbar switches, multi-level caches, DRAM and I/O controllers. Our Niagara chip, for example, has eight cores (each four-way threaded), a caching crossbar switch, four memory controllers, and a high speed I/O channel. And its performance is very competitive with our original E10K, the 64-processor behemoth that stormed the world as the first high-volume, enterprise-class, massive multiprocessor.

Moore's Law is VERY much alive. And as Marc Tremblay puts it, with Niagara, it's as if we have leapt forward one, or even two, generations of integration.

The secret was to turn the clock back --- figuratively and literally --- to earlier, more sane processor pipeline designs. Ones that were more conservative of transistors, area, power, and complexity. (A key innovation, however, was to finally fold multithreading into the individual pipes). With these smaller, leaner and far more power-efficient processor cores, we could then use the transistor count advance of Moore's Law to paste down many of them on the same die, and to integrate the rest of the SMP guts at the same time.
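The tradeoff behind that design choice can be made concrete with a toy model. All of the numbers below are purely illustrative assumptions (they are not Niagara's actual figures): suppose a simple, threaded core takes one-eighth the transistors of a big out-of-order core, delivers a quarter of its single-thread throughput, but hides latency well enough with threading to sustain most of it:

```python
# Toy model: spend one fixed transistor budget two ways.
# All figures are invented for illustration, not measured data.
budget = 300_000_000  # hypothetical transistor budget for the CPU area

complex_core = {"transistors": 300_000_000, "throughput": 4.0, "watts": 100.0}
simple_core  = {"transistors":  37_500_000, "throughput": 1.0, "watts":   8.0}

n_simple = budget // simple_core["transistors"]          # 8 small cores fit
agg_throughput = n_simple * simple_core["throughput"]    # 8.0 vs 4.0 units
agg_watts = n_simple * simple_core["watts"]              # 64 W vs 100 W

print(n_simple, agg_throughput, agg_watts)
```

Under these (made-up) numbers, the many-small-cores design doubles aggregate throughput at two-thirds the power --- which is the shape of the argument, even if the real ratios differ.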

The result is a chip that is incredibly hot performance-wise, and way cool thermally speaking. Truly an awesome accomplishment.

Incidentally, Opteron is a microsystem, too. You can get a couple of cores, an integrated memory controller, and a set of smartly architected serial network ports (HyperTransport) that bridge to I/O systems, or directly connect to other Opterons. Our good friends at AMD are actively killing the microprocessor with Opteron. From our vantage, they are still leaving a lot of potential performance on the table (and power efficiency as well) by not reducing core complexity and adding aggressive multithreading. That being said, Opteron is seriously spanking Xeon with the lower memory latency benefit of on-chip DRAM controllers.

Where does this end up? Well, we are now dying to get to 65nm (Niagara is 90nm) so we can get even more transistors on a chip in order to integrate more and bigger systems. Just as the microprocessor harvested the pipeline inventions of the '60s and '70s, microsystems are going to integrate the system innovations of the '80s and '90s.

By 2010 microprocessors will seem like really old ideas. Motherboards will end up in museum collections. And the whole ecology that we have around so-called industry standard systems will collapse as it becomes increasingly obvious that the only place computer design actually happens is among those who design chips. Everything downstream is just sheet metal. The apparent diversity of computer manufacturers is an illusion about to be shattered. In 2010, if you can't craft silicon, you can't add value to computer systems. You'd be about as innovative as a company in the '90s that couldn't design a printed circuit board.

Thanks, Gordon.

