Revisiting the Intel 432

As I have discussed before, I strongly believe that to understand systems, you must understand their pathologies -- systems are most instructive when they fail. Unfortunately, we in computing systems do not have a strong history of studying pathology: despite the fact that failure in our domain can be every bit as expensive as (if not more expensive than) failure in traditional engineering domains, our failures do not (usually) involve loss of life or physical property and there is thus little public demand for us to study them -- and a tremendous industrial bias for us to forget them as much and as quickly as possible. The result is that our many failures go largely unstudied -- and the rich veins of wisdom that these failures generate live on only in oral tradition passed down by the perps (occasionally) and the victims (more often).

A counterexample to this -- and one of my favorite systems papers of all time -- is Robert Colwell's brilliant Performance Effects of Architectural Complexity in the Intel 432. This paper, which dissects the abysmal performance of Intel's infamous 432, practically drips with wisdom, and is just as relevant today as it was when the paper was originally published nearly twenty years ago.

For those who have never heard of the Intel 432, it was a microprocessor conceived of in the mid-1970s to be the dawn of a new era in computing, incorporating many of the latest notions of the day. But despite its lofty ambitions, the 432 was an unmitigated disaster both from an engineering perspective (the performance was absolutely atrocious) and from a commercial perspective (it did not sell -- a fact presumably not unrelated to its terrible performance). To add insult to injury, the 432 became a sort of punching bag for researchers, becoming, as Colwell described, "the favorite target for whatever point a researcher wanted to make."

But as Colwell et al. reveal, the truth behind the 432 is a little more complicated than trendy ideas gone awry; the microprocessor suffered from not only untested ideas, but also terrible execution. For example, one of the core ideas of the 432 is that it was a capability-based system, implemented with a rich hardware-based object model. This model had many ramifications for the hardware, but it also introduced a dangerous dependency on software: the hardware was implicitly dependent on system software (namely, the compiler) for efficient management of protected object contexts ("environments" in 432 parlance). As it happened, the needed compiler work was not done, and the Ada compiler as delivered was pessimal: every function was implemented in its own environment, meaning that every function was in its own context, and that every function call was therefore a context switch! As Colwell explains, this software failing was the greatest single inhibitor to performance, costing some 25-35 percent on the benchmarks that he examined.

If the story ended there, the tale of the 432 would be plenty instructive -- but the story takes another series of interesting twists: because the object model consumed a bunch of chip real estate (and presumably a proportional amount of brain power and department budget), other (more traditional) microprocessor features were either pruned or eliminated. The mortally wounded features included a data cache (!), an instruction cache (!!) and registers (!!!). Yes, you read correctly: this machine had no data cache, no instruction cache and no registers -- it was exclusively memory-memory. And if that weren't enough to assure awful performance: despite having 200 instructions (and about a zillion addressing modes), the 432 had no notion of immediate values other than 0 or 1. Stunningly, Intel designers believed that 0 and 1 "would cover nearly all the need for constants", a conclusion that Colwell (generously) describes as "almost certainly in error." The upshot of these decisions is that you have more code (because you have no immediates) accessing more memory (because you have no registers) that is dog-slow (because you have no data cache) that itself is not cached (because you have no instruction cache). Yee haw!
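To make the compounding concrete, here is a toy sketch that counts data memory references for summing an array and then adding a constant. It is not taken from Colwell's paper: the function names, the workload, and the per-operation reference counts are all illustrative assumptions, not 432 measurements, and instruction fetches (which the missing instruction cache would make worse still) are not counted at all.

```python
# Toy model (assumed numbers, not 432 measurements): count data memory
# references for  acc = 0; acc += a[i] for n elements; acc += 5.

def memory_memory_refs(n):
    """Hypothetical memory-memory machine with no registers and only 0/1 immediates."""
    refs = 1                # store acc = 0 (0 is available as an immediate)
    for _ in range(n):
        refs += 3           # read acc, read a[i], write acc back to memory
    refs += 3               # read acc, read the constant 5 from memory, write acc
    return refs

def register_refs(n):
    """Hypothetical register machine with general immediates."""
    refs = 0
    for _ in range(n):
        refs += 1           # read a[i]; acc stays in a register
    refs += 1               # final store of acc (the 5 is an immediate)
    return refs

if __name__ == "__main__":
    n = 100
    mm, reg = memory_memory_refs(n), register_refs(n)
    print(f"memory-memory: {mm} refs, register: {reg} refs, ratio {mm / reg:.1f}x")
```

Under these assumptions the memory-memory machine makes roughly three times as many data references, and without a data cache every one of them goes all the way out to memory.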

Colwell's work builds to a crescendo as it methodically takes apart each of these architectural issues -- and then attempts to model what the microprocessor would look like were it properly implemented. The conclusion he comes to is that the object model -- long thought to be the 432's singular flaw -- was only one part of a more complicated picture, and that its performance was "dominated, in large part, by artifacts and not by concepts." If there's one imperfection with Colwell's work, it's that he doesn't realize how convincingly he's made the case that these artifacts were induced by a rigid and foolish adherence to the concepts.

So what is the relevance of Colwell's paper now, 20 years later? One of the principal problems that Colwell describes is the disconnect between innovation at the hardware and software levels. This disconnect continues to be a theme, and can be seen in current controversies in networking (TOE or no?), in virtualization (just how much microprocessor support do we want/need -- and at what price?), and (most clearly, in my opinion) in hardware transactional memory. Indeed, like an apparition from beyond the grave, the Intel 432 story should serve as a chilling warning to those working on transactional memory today: like the 432 object model, hardware transactional memory requires both novel microprocessor architecture and significant new system software. And like the 432 object model, hardware transactional memory has been touted more for its putative programmer productivity than for its potential performance gains. This is not to say that hardware transactional memory is not an appropriate direction for a microprocessor, just that its advocates should not so stubbornly adhere to their novelty that they lose sight of the larger system. To me, that is the lesson of the Intel 432 -- and thanks to Colwell's work, that lesson is available to all who wish to learn it.

Comments:

Bryan, is Rock really in that much trouble?

Posted by Dimitar on July 18, 2008 at 10:05 AM PDT #

Fascinating stuff. Computer history in general is not recorded, which is a shame. There are a few populist books around, like "Soul of a New Machine" and "Showstopper!" that are entertaining but hardly insightful. Contemporary magazines describe the new hotness in great detail but any historical examination is typically a cursory adjunct.

I've been somewhat encouraged by the appearance of books that describe the internal design of operating systems (like Windows Internals and Solaris Internals) in a way that is accessible without sacrificing necessary detail. These fascinating books offer a real way for engineers to learn how other engineers have tackled some very complex system problems without having to grok an entire code base.

Posted by Andrew on July 18, 2008 at 10:24 AM PDT #

Hi Bryan. I can say that I was one of the folks working on the iAPX432 project (on systems-level design). The real problem with the 432 project was that it was in production at a time when Intel was only beginning to ship the 80286! Can you see the disconnect here? Yes, the design team made some unfortunate choices, but they had not even begun to design the workhorse processors that would become the foundation for our current CISC/RISC processors, and even those processors (80386, Pentium, et al) are considered relics now. More than just being ahead of its time, even the semiconductor processes needed to make such a design plausible did not yet exist. The 'commercial' product which rolled over the 432, the 80286, did not have any cache either. And so on. While you can fault the project and team for trying to put jet engines on a primitive biplane, the 432 endeavor has directly influenced the design of modern microprocessors, not to mention adding further legitimacy to the OO software concept - another idea that was considered ill-conceived at that time.
I am writing to say that taking potshots at the 432 is in itself ill-considered, except in a superficial sense. In reference to your main point, I would point out that the 432 was an OO design through and through. All code was executed on behalf of objects, and all objects ran somewhat asynchronously. Interesting, one of the hottest topics in OO design today. Transactional memory was the only way of dealing with such a paradigm at the time. The real question is "what do you hope to accomplish?" I will agree that there is a growing discontinuity between processor platforms/capabilities and the software developed on top of such hardware implementations. In my mind, the relevant question is one of whether we will continue as in the past, with various approaches in the microprocessor realm doomed to failure as they find little to no software embracing those features. And conversely, software continuing to gain what could be key hardware features for functionality and performance due to intransigence on the part of commercial software and hardware interests?

Posted by Michael on July 20, 2008 at 07:25 AM PDT #

Michael, Interesting to know that this is still a hot topic over 20 years after the fact! As for what I hope to accomplish, I just want to bring attention to what I believe is a superlative systems paper. And indeed, the thoughts on the 432 here are largely just summarizing Colwell's paper; do you disagree with its findings as well? I don't think Colwell's work or my summary are taking "potshots" at the 432, but rather trying to better understand its performance and learn from its design decisions -- part of the reason that we tend to repeat mistakes in our industry is that we don't stop frequently enough to look back at the ramifications of our past decisions...

Posted by Bryan Cantrill on July 20, 2008 at 11:58 AM PDT #
