UltraSPARC CoolThreads Server parts count
By relling on Dec 12, 2005
Previously, I mentioned how parts count affects reliability. I was happy when David Yen used one of the slides we developed in the Sun Fire™ CoolThreads Server launch last week. It is always good to see your work get widespread exposure.
Doing a parts count is quite simple for our own products. When you do the design, it is just a matter of having the CAD tools print the component parts used as a bill of materials (BOM). This information is put into various databases and is used for procurement, manufacturing, and service throughout the life cycle of the product. In the RAS Engineering (Reliability, Availability, and Serviceability) group, we use this data to make reliability projections. Early in the design cycle, we might make many reliability projections as the design trade-offs are being made. We also feed the reliability projections into more complex RAS models and benchmarks, which we use to improve our designs and to compare them with other designs.
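To make the idea of a BOM-based projection concrete, here is a minimal sketch in Python. It assumes the simplest textbook form of a parts-count estimate: every part has a constant failure rate, any part failure counts as a system failure, and the system failure rate is the sum of quantity times per-part failure rate. This is an illustration only, not a description of the actual RAS Engineering tools or models. The names `BomLine`, `system_fit`, and `mtbf_hours` are hypothetical, and the failure rates are made-up placeholders, not measured data for any real product.

```python
from dataclasses import dataclass

# FIT = failures per 10^9 device-hours, a common unit for component failure rates.
HOURS_PER_BILLION = 1e9


@dataclass(frozen=True)
class BomLine:
    """One line of a component-level bill of materials (hypothetical structure)."""
    part_type: str
    quantity: int
    fit_per_part: float  # placeholder value, NOT real component data


def system_fit(bom):
    """Sum quantity * failure rate over all BOM lines (simple series model)."""
    return sum(line.quantity * line.fit_per_part for line in bom)


def mtbf_hours(total_fit):
    """Convert a total failure rate in FIT to mean time between failures in hours."""
    if total_fit <= 0:
        raise ValueError("total failure rate must be positive")
    return HOURS_PER_BILLION / total_fit


# Illustrative BOM with invented quantities and failure rates.
example_bom = [
    BomLine("capacitor", 1000, 0.5),
    BomLine("asic", 4, 50.0),
    BomLine("connector", 20, 5.0),
]

if __name__ == "__main__":
    total = system_fit(example_bom)
    print(f"system failure rate: {total} FIT")
    print(f"projected MTBF: {mtbf_hours(total)} hours")
```

Even in this toy model the effect of parts count is visible: every line you remove from the BOM, or every quantity you reduce through integration, lowers the summed failure rate directly.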
For competitive products, we rarely get the component-level BOM, so we have to do it the old-fashioned way: purchase a product and count the parts. We then build reliability projections using the same methods we use for our own products. This gives us a common baseline for comparing product reliability. Before you ask: no, I won't share the detailed results of these comparisons with you. Suffice it to say, the Sun Fire™ CoolThreads Servers kick serious butt in performance, price, and especially reliability.
When I present this slide, I often notice that many people are surprised that there are so many parts in a modern server. In truth, some components, like capacitors, are everywhere. In general, capacitors are very reliable and are often used to filter unwanted signals, which is a good thing. A modern, enterprise-class server may have hundreds or thousands of capacitors. Over time, more and more functions become integrated into fewer parts. At one extreme is the UltraSPARC™ T1 processor itself, which is essentially 8 (or 32, depending on how you count) processors and 4 memory controllers integrated into one chip. But integration is occurring everywhere, including the new I/O ASIC, integrated RAID controllers, system controllers, and network interfaces. A quick browse through the pictures in the Sun Fire™ T1000 and T2000 Server Architecture white paper (see pages 20 and 23) should give you an indication of how highly integrated these servers really are. Or just get your hands on one and open the cover. You might glean some value from the full components list, though that is really just the FRU components list, not the component-level BOM we use for reliability projections. We've also reduced the parts count in the power supplies, as I have blogged about previously. We can't quite put everything into one chip yet, but we're getting closer, and you can expect even more integration in the next version of the UltraSPARC T1 processor, code-named Niagara-2. I have no doubt that we will continue to drive high reliability and high integration.