Kicking Butt with OpenMP: The Power of CMT
By Josh Simons on Apr 10, 2008
Yesterday we launched our 3rd generation of multicore / multithreaded SPARC machines, and again the systems should turn some heads in the HPC world. Last October, we introduced our first single-socket UltraSPARC T2-based systems with 64 hardware threads and eight floating point units. I blogged about the HPC aspects here. These systems showed great promise for HPC, as illustrated in this analysis done by RWTH Aachen.
We have now followed that with two new systems that offer up to 128 hardware threads and 16 floating point units per node. In one or two rack units. With up to 128 GB of memory in the 2U system. These systems, the Sun SPARC Enterprise T5140 and T5240 Servers, both contain two UltraSPARC T2 Plus processors, which maintain coherency across four point-to-point links with a combined bandwidth of 50 GB per second. Further details are available in the Server Architecture whitepaper.
An HPC person might wonder how a system with a measly clock rate of 1.4 GHz and 128 hardware threads spread across two sockets might perform against a more conventional two-socket system running at over 4 GHz. Well, wonder no more. The OpenMP threaded parallelization model is a great way to parallelize codes on shared-memory machines and a good illustration of the value proposition of our throughput computing approach. In short, we kick butt with these new systems on SPEComp and have established a new world record for two-socket systems. Details here.
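For readers who have not seen OpenMP, here is a minimal sketch of the kind of loop-level parallelism it expresses. This is an illustration only, not code from SPEComp or from any Sun benchmark run; the function name sum_of_squares is made up for this example. A single pragma asks the compiler to divide the loop iterations among the available threads and to combine each thread's partial sum at the end.

```c
/* Hypothetical example: sum 0^2 + 1^2 + ... + (n-1)^2 across OpenMP threads.
 * The function name and workload are illustrative assumptions, not taken
 * from the benchmarks discussed in this post. */
double sum_of_squares(long n)
{
    double total = 0.0;

    /* Each thread accumulates a private partial sum over its share of the
     * iterations; the reduction clause adds the partial sums together when
     * the loop finishes. Without OpenMP enabled, the pragma is ignored and
     * the loop simply runs serially with the same result. */
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; i++) {
        total += (double)i * (double)i;
    }

    return total;
}
```

The pragma only takes effect when the compiler's OpenMP option is switched on (for example -fopenmp with GCC or -xopenmp with Sun Studio), and the number of threads is usually set through the OMP_NUM_THREADS environment variable. On a machine with many hardware threads, the same source can spread its iterations across all of them without any change to the loop body.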
Oh, and we also just released world record two-socket numbers for both SPECint_rate2006 and SPECfp_rate2006 using these new boxes. Check out the details here.