Thursday Nov 20, 2008
Wednesday Nov 19, 2008
By Josh Simons on Nov 19, 2008
Yesterday we officially announced that Sun will be supplying Sandia National Laboratories its next generation clustered supercomputer, named Red Sky. Douglas Doerfler from the Scalable Architectures Department at Sandia spoke at the Sun HPC Consortium Meeting here in Austin and gave an overview of the system to assembled customers and Sun employees. As Douglas noted, this was the world premiere Red Sky presentation.
The system is slated to replace Thunderbird and other aging cluster resources at Sandia. It is a Sun Constellation system using the Sun Blade 6000 blade architecture, but with some differences. First, the system will use a new diskless two-node Intel blade to double the density of the overall system. The initial system will deliver 160 TFLOPs peak performance in a partially populated configuration with expansion available to 300 TFLOPs.
Second, the interconnect topology is a 3D torus rather than a fat-tree. The torus will support Sandia's secure red/black switching requirement with a middle "swing" section that can be moved to either the red or black side of the machine as needed with the required air gap.
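For readers who haven't worked with a torus before, here is a minimal sketch of what the topology means for addressing and distance. The 4x4x4 dimensions and the function names are my own illustrative assumptions, not Red Sky's actual layout. Each node has six neighbors, and each dimension wraps around to form a ring.

```python
def torus_neighbors(node, dims):
    """Return the six neighbors of `node` in a 3D torus with wraparound links."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            # The modulo implements the wraparound link that closes each ring.
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def torus_hops(a, b, dims):
    """Minimum hop count between two nodes, going the short way around each ring."""
    return sum(min(abs(x - y), n - abs(x - y)) for x, y, n in zip(a, b, dims))


DIMS = (4, 4, 4)  # hypothetical size, chosen only for illustration
print(torus_neighbors((0, 0, 0), DIMS))
print(torus_hops((0, 0, 0), (3, 2, 1), DIMS))
```

Unlike a fat-tree, there are no dedicated upper-level switches here: traffic is forwarded hop by hop through neighboring nodes' switches, so the distance between two nodes depends on where they sit in the grid.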
Primary software components include CentOS, Open MPI, OpenSM, and LASH for deadlock-free routing across the torus. The filesystem will be based on Lustre. oneSIS will be used for diskless cluster management, including booting over InfiniBand.
Monday Nov 17, 2008
By Josh Simons on Nov 17, 2008
About ten years ago the HPC community attempted to embrace Java as a viable approach for high performance computing via a forum called Java Grande. That effort ultimately failed for various reasons, one of which was the difficulty of achieving acceptable performance for interesting HPC workloads. Today at the HPC Consortium Meeting here in Austin, Professor Denis Caromel from the University of Nice made the case that Java is ready now for serious HPC use. He described the primary features of ProActive Java, a joint project between INRIA and University of Nice CNRS, and provided some performance comparisons against Fortran/MPI benchmarks.
As background, Denis explained that the goal of ProActive is to enable parallel, distributed, and multi-core solutions with Java using one unified framework. Specifically, the approach should scale from a single, multi-core node to a large, enterprise-wide grid environment.
ProActive embraces three primary areas: Programming, Optimizing, and Scheduling. The programming approach is based on the use of active objects to create a dataflow-like asynchronous communication framework in which objects can be instantiated in either separate JVMs or within the same address space in the case of a multi-core node. Objects are instantiated asynchronously on the receiver side and then represented immediately on the sender side by "future objects" which will be populated asynchronously when the remote computation completes. Accessing future objects whose contents have not yet arrived causes a "wait by necessity" which implements the dataflow synchronization mechanism.
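The future-object idea is easier to see in code. The following is only an analogy using Python's standard concurrent.futures module, not the ProActive API; the function names are hypothetical stand-ins. One difference worth flagging: here the blocking point is an explicit result() call, whereas the talk described the wait as happening when the future's contents are accessed.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_compute(x):
    """Hypothetical stand-in for a method call on an active object."""
    time.sleep(0.05)  # pretend the work happens somewhere else
    return x * x


def demo():
    with ThreadPoolExecutor(max_workers=1) as pool:
        # submit() returns a future immediately; the caller is not blocked.
        future = pool.submit(remote_compute, 7)
        # The caller keeps doing useful local work in the meantime.
        local = sum(range(10))
        # Only when the value is actually needed does the caller wait,
        # which is the role "wait by necessity" plays in the dataflow model.
        value = future.result()
    return local, value


print(demo())
```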
ProActive also supports a SPMD programming style with many of the same primitives found in MPI -- e.g., barriers, broadcast, reductions, scatter-gather, etc.
Results for several NAS parallel benchmarks were presented, in particular CG, MG, and EP. On CG, the ProActive version performed at essentially the same speed as the Fortran/MPI version over a range of problem sizes from 1 to 32 processes. Fortran did better on MG and this seems to relate to issues around large memory footprints, which the ProActive team is looking at in more detail. With EP, Java was faster or significantly faster in virtually all cases.
Work continues to lower messaging latency, to optimize in-node data transfers by sending pointers rather than data, and to reduce message-size overhead.
When asked how ProActive compares to X10, Denis pointed out that while X10 does share some concepts with ProActive, X10 is a new language while ProActive is designed to run on standard JVMs and to enable the use of standard Java for HPC.
A full technical paper about ProActive in PDF format is available here.
Sunday Nov 16, 2008
By Josh Simons on Nov 16, 2008
Karl Schulz, Associate Director for HPC at the Texas Advanced Computing Center, gave an update on Ranger, including current usage statistics as well as some of the interesting technical issues they've confronted since bringing the system online last year.
Karl started with a system overview, which I will skip in favor of pointing to an earlier Ranger blog entry that describes the configuration in detail. Note, however, that Ranger is now running with 2.3 GHz Barcelona processors.
As of November 2008, Ranger has more than 1500 allocated users who represent more than 400 individual research projects. Over 300K jobs have been run so far on the system, consuming a total of 220 million CPU hours.
When TACC brought their 900 TeraByte Lustre filesystem online, they wondered how long it would take to fill it. It took six months. Just six months to generate 900 TeraBytes of data. Not surprising, I guess, when you hear that users generate between 5 and 20 TeraBytes of data per day on Ranger. Now that they've turned on their file purging policy, files currently reside on the filesystem for about 30 days before they are purged, which is quite good as supercomputing centers go.
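As a back-of-the-envelope check on those numbers, here is a small sketch. The 30-day month and the steady-state model (constant generation rate, every file purged at exactly the retention age) are my simplifying assumptions, not anything Karl presented.

```python
CAPACITY_TB = 900      # filesystem size quoted in the talk
FILL_DAYS = 6 * 30     # assumption: six months of 30 days each
RETENTION_DAYS = 30    # "about 30 days" before purging


def average_fill_rate(capacity_tb, days):
    """Average TB written per day needed to fill the filesystem in `days`."""
    return capacity_tb / days


def steady_state_tb(rate_tb_per_day, retention_days):
    """Data resident on disk if files are purged after `retention_days`."""
    return rate_tb_per_day * retention_days


print(average_fill_rate(CAPACITY_TB, FILL_DAYS))
for rate in (5, 20):  # the quoted range of daily data generation
    print(rate, steady_state_tb(rate, RETENTION_DAYS))
```

Filling 900 TeraBytes in six months works out to an average of about 5 TeraBytes per day, the low end of the quoted range. Under the same assumptions, a 30-day retention window keeps roughly 150 to 600 TeraBytes resident, which fits within the filesystem.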
Here are some of the problems Karl described.
OS jitter. For those not familiar, this phrase refers to a sometimes-significant performance degradation seen by very large MPI jobs that is caused by a lack of natural synchronization between participating nodes due to unrelated performance perturbations on individual nodes. Essentially some nodes fall slightly behind, which slows down MPI synchronization operations, which can in turn have a large effect on overall application performance. The worse the loss of synchronization, the longer certain MPI operations take to complete, and the larger the overall application performance impact.
A user reported severe performance problems with a somewhat unusual application that performed about 100K MPI_Allreduce operations with a small amount of intervening computation between each Allreduce. When running on 8K cores, a very large performance difference was seen when running 15 processes per node versus 16 processes per node. The 16-process-per-node runs showed drastically lower performance.
As it turned out, the MPI implementation was not at fault. Instead, the issue was traced primarily to two causes: first, an IPMI daemon that was running on each node and, second, another daemon that was being used to gather fine-grained health monitoring information to be fed into Sun Grid Engine. Once the IPMI daemon was disabled and some performance optimization work was done on the health daemon, the 15- and 16-process runs showed almost identical run times.
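To see why rare interruptions hurt so much at scale, here is a toy model. It is a sketch under my own assumptions, not TACC's analysis: every step ends in a collective that waits for the slowest rank, and each rank is independently interrupted with a small probability per step. All parameter values are made up for illustration.

```python
import random


def delayed_fraction(ranks, p_interrupt):
    """Probability that at least one of `ranks` processes is interrupted in a step."""
    return 1.0 - (1.0 - p_interrupt) ** ranks


def simulate(ranks, steps, compute_us, interrupt_us, p_interrupt, seed=0):
    """Total time when each step ends in a collective that waits for the slowest rank."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(steps):
        # The collective cannot complete until the slowest rank arrives.
        slowest = max(
            compute_us + (interrupt_us if rng.random() < p_interrupt else 0.0)
            for _ in range(ranks)
        )
        total += slowest
    return total


# Hypothetical numbers: 10 us of compute per step, a 100 us daemon
# interruption, and a 0.1% chance per rank per step of being interrupted.
for ranks in (16, 256, 4096):
    ideal = 200 * 10.0
    actual = simulate(ranks, 200, 10.0, 100.0, 0.001)
    print(ranks, round(delayed_fraction(ranks, 0.001), 3), round(actual / ideal, 2))
```

In this model a single rank is almost never delayed, but with thousands of ranks nearly every step has at least one straggler, so everyone pays the interruption cost nearly every time. One plausible reading of the 15-versus-16 result is that leaving a core free gives the daemons somewhere to run without preempting an MPI rank, which corresponds to a much lower interruption probability here.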
Karl also showed an example of how NUMA effects at scale can cause significant performance issues. In particular, it isn't sufficient to deal with processor affinity without also paying attention to memory affinity. Off-socket memory access can kill application performance in some cases, as in the CFD case shown during the talk.
By Josh Simons on Nov 16, 2008
When Karl Schulz, Assistant Director at TACC, spoke today at the HPC Consortium Meeting, he asked everyone to do their part--within legal limits--to help Keep Austin Weird. Having been a part of the HPC community for many years, I'm pretty sure we are collectively more than up to the task. The phrase "core competency" comes to mind.
Saturday Nov 15, 2008
By Josh Simons on Nov 15, 2008
When the band at this evening's HPC Consortium dinner event invited guests to join them on stage to do some singing, I didn't think anything of it...until some time later when I heard a new voice and turned to see someone up there who sounded good and looked like he was having a good time. He was wearing a conference badge, but I couldn't see whether he was a customer or Sun employee.
Since I was taking photos, it was easy to snap a shot and then zoom in on his badge to see who it was. Imagine my surprise as the name came into focus and I saw that it was Gregg TeHennepe, one of our customers from the Jackson Laboratory where he is a senior manager and research liaison. I was surprised because I hadn't realized it was Gregg in spite of the fact that I had eaten breakfast with him this morning and talked with him several times during the day about the fact that he intends to blog the HPC Consortium meeting, which so far as I know marks the first time a customer has blogged this event.
My surprise continued, however. When I googled Gregg just now to remind myself of the name of his blog, I found he is a member of Blue Northern, a five-piece band from Maine that plays traditional, original, and contemporary acoustic music. So, yeah. I guess he does sound good.
Gregg's blog is Mental Burdocks. I'll save you the trip to the dictionary and tell you that a burdock is one of those plants with hook-bearing flowers that get stuck on animal coats (and other kinds of coats) for seed dispersal.
By Josh Simons on Nov 15, 2008
I arrived last night in Austin for the Sun HPC Consortium meeting this weekend and for Supercomputing '08 next week. I joined several colleagues for a casual dinner at a local home hosted by Deirdré and her daughter Ross and attended by several of Ross' friends. It was a fun and relaxing way to ease into this trip, which year to year proves to be pretty exhausting since the Consortium runs all weekend and is followed immediately by Supercomputing, which we can count on to deliver a week of high-energy, non-stop sensory overload.
Thanks to Deirdré for the invitation and to Ross, Mo, Trishna, April, Griffin, and Terry for the entertaining evening and the extremely wide-ranging conversation (wow!). And a special thanks to Ross for hosting and for introducing me to a Sicilian pesto that's to die for! YUM. In return, I hope they enjoyed my "opening the apple" demo.
Our internal training session for Sun's HPC field experts (HPC ACES) will finish in about an hour, at which point we will break for lunch and then return for the start of the HPC Consortium meeting.
Let the games begin!