POWER6 Goes Thud: Part VII
By jmeyer on Nov 20, 2007
Still Missing After Six Months ...
It's been three long months since my last blog entry (shame on me!). And once again, I've been criss-crossing the country evangelizing SPARC. I've also pledged to dedicate more time to talking about SPARC developments in this blog going forward, since there's so much to talk about. So it's time for a nice long Thanksgiving break, but before I take off, I had to take advantage of IBM's slip-ups one more time and update you on something I know IBM will be loath to remind you of.
Today marks the semi-anniversary of their failure to deliver on their POWER6 announcement. Once again, here's what the POWER6-upgradable server line looked like on May 21, 2007 and still looks like today, HALF A YEAR LATER:
Yes, I'm well aware of their announcement of the new JS22 blade, but I don't count fractions of servers here. At 4 DIMM slots, one disk drive, one PCIe slot, and no on-board environmental monitoring, it's at most half a blade server\*. And true to form, IBM announced it months before you can actually buy it.
And what about POWER6 performance? You would think that after all this time, and after all of IBM's marketing hype about sky-high clock frequencies, they would have some numbers to show for it. Not so. The masked, enigmatic BMSeer has, over the last six months, debunked the more outrageous performance claims made by IBM's benchmarketing machine, and recently pointed out all of the benchmarks IBM is terrified to run on POWER6, even after half a year.
Contrast that with the performance leadership of the Sun SPARC Enterprise (a.k.a. "APL") servers. Since announcement, we've set 36 world performance records on these systems, 13 of which are still current at the time of this writing. The new UltraSPARC™ T2-based Sun SPARC Enterprise T5x20 servers alone have set 9 of them.
And we're not cherry-picking, either. I'm talking about a wide variety of workloads: CPU performance (best single-chip SPECint_rate2006 and SPECfp_rate2006 results), bandwidth (best STREAM results), web server (best SPECweb2005 results), app server (best SPECjAppServer2004 results), Java (best SPECjbb2005 results), cryptography (37,000 RSA 1024-bit signs/sec and up to 38.9 Gbit/sec of AES-128 throughput), SAP (best single-socket SAP SD two-tier result and best 16-processor SAP SD two-tier result), database (best iGEN-OLTP results), and HPC (best LINPACK result, best SPECompL2001 result, best 32-thread SPECompM2001 result, and best single-chip SPECompM2001 result).
This last one is especially important, because if there's one area where a high-frequency single-threaded performer should beat SPARC, it should be the HPC apps. So why did we beat IBM's best POWER6 LINPACK result by 12% with our Sun SPARC Enterprise M9000? And, as reported in a recent internetnews.com article, if almost half of the supercomputers in the Top 500 list belong to IBM, how come not a single one is a POWER5- or POWER6-based system? "IBM did this because it needed a cooler chip [the older, slower PowerPC 440 chip] to achieve the density required ..."
There's another problem with this interminable schedule slip. IBM's own customers tell us that they are fed up with waiting so long for IBM to fill out a product line that requires re-qualifying, re-testing, and re-certifying in their datacenters with each new chip generation, only to have that generation end a year or two after the gaping holes in the product line are filled.
I'll reiterate that Sun announced the Sun SPARC Enterprise server line in its entirety in April, and that the servers in the line reached revenue release and general availability within a few weeks of each other over the last couple of months. That is clearly the record of a product line that's ready for mission-critical deployment. That's ready for prime time.
So why can't IBM do something with POWER6 that we do all the time with SPARC? Simple: they are struggling with power and cooling issues brought on by a disastrous decision made years ago to pursue clock frequency at all costs. But as the rest of the industry has figured out, scaling performance with clock frequency is an evolutionary dead-end. And if you're dazzled by the performance numbers claimed by IBM's benchmarketers (despite their shenanigans), you'll have plenty of time to get familiar with them. POWER6 can only offer further small incremental performance gains from ever-more-ridiculous clock frequencies, and I'll believe rumors about a quad-core POWER6 when I see one. POWER7 isn't due until 2010 or 2011 at least and there's no indication that it will make any serious break from the design mistakes of the past. In contrast, by my count, Sun will be introducing at least six new processor upgrades in three binary-compatible SPARC product lines in the same period of time, each targeted to optimizing different application workloads.
When will IBM understand that betting against threads is like betting against Internet bandwidth a decade ago? It's tough to say. They've made statements before that they get this ("You'll see us go more aggressively against threads"), but recently backed out of plans to introduce a massively-threaded processor. Perhaps they figured out that it's a very hard thing to get right. Or maybe they found themselves spending too much time gingerly stepping around the over 100 patents that Sun has on the technology.
So I'm going to make some Thanksgiving-day predictions:
- It will be well into 2008 before IBM can fix its manufacturing problems around POWER6 and deliver a high end. This may be a no-brainer, I know, but the question is: will it be closer to the beginning of the year, or closer to the one-year anniversary of the announcement (or longer)? And I said deliver, not just issue a press release.
- IBM will not be able to manufacture a single- or dual-socket POWER6 rackmount server to follow the p505, p510, or p520 for power and cooling reasons. Instead, they will be forced into underpowered blades like the JS22, or into gargantuan dual-wide blades just to get anything like volume production on POWER6 and to get a server capable of doing any useful I/O. There is another motivation here, and that's to force customers into a proprietary lock-in with IBM's aging BladeCenter chassis.
- There will never be a meaningful POWER6 processor family dedicated to optimizing different datacenter workloads. IBM will have to continue to position a single POWER implementation as a one-size-fits-all solution to their customers' computing needs for the foreseeable future.
- The "Sun leads, IBM follows" technology story around processors and systems will continue, just as it has around AIX and UNIX virtualization.
Sun hasn't had such a phenomenal competitive advantage over Big Blue in many years. And that's something to be truly thankful for this season.
\* Compare to Sun's new Sun Blade T6320: 8 cores vs. IBM's 4; 16 DIMM slots vs. IBM's 4; 64 hardware threads vs. IBM's 4; 4 hot-pluggable disks vs. IBM's 1 non hot-pluggable; 4 PCIe slots vs. IBM's 1. Don't be fooled by IBM's claim of supporting 32GB on this thing; you'd have to mortgage the Taj Mahal to pay for 8GB DIMMs today.
IBM®, POWER™, POWER4™, POWER5™, POWER6™, pSeries®, System p5™, Redbook®, DB2®, and AIX® are all trademarks of the International Business Machines Corporation in the United States, other countries, or both.
SPEC, SPECint, SPECweb, SPECfp, SPECjAppServer, SPECjbb, and SPEComp are registered trademarks of the Standard Performance Evaluation Corporation. For more information, go to http://www.spec.org. For a description of the STREAM benchmark, see McCalpin, John D.: "STREAM: Sustainable Memory Bandwidth in High Performance Computers", a continually updated technical report (1991-2007), available at: http://www.cs.virginia.edu/stream/.