TeraGrid '08, Las Vegas, NV
By Linda Fellingham on Jun 10, 2008
The annual TeraGrid Users' meeting is in Las Vegas this week. Since SuperComputing was in Reno last November and now the TeraGrid meeting is here, it appears that either Nevada is the hub of the HPC universe, or people who build big computers like to gamble.
There were tutorial sessions yesterday. I attended the Data Management and Visualization sessions. The overall theme seemed to be that the biggest problem associated with using TeraGrid resources is moving large data sets around. The Data Management session in the morning outlined ways to move data to a TG compute resource. "Your mileage may vary" was the big theme. It sounds like you CAN transfer data at 600 MB/sec if you do it just the right way with full knowledge of the file system architecture on both sides, BUT you are more likely to get 1 MB/sec. This sounds like a real opportunity for better software to make this easier and faster. "WAN Lustre" and TeraGrid-aware scp were proposed.
Remote visualization using TG resources was one answer to the data-moving problem: just don't move the data, use it where it is. Kelly Gaither from TACC talked about using Maverick, which runs Shared Visualization software (VirtualGL, TurboVNC, and Sun Grid Engine), for remote visualization and collaboration, and Joe Insley from Argonne National Laboratory talked about using remote ParaView on their graphics cluster. One thing I took away from this session was that parallel visualization applications (like ParaView) are really hard for the casual scientific user. Also, the mechanics of using visualization applications remotely seemed complex. At least for TACC, we need to do a better job of educating users on the ease-of-use improvements in Shared Viz 1.1. Paul Navratil talked about using the Ranger cluster for visualization; they have users running ParaView with software rendering (Mesa). Performance should improve greatly when the hardware-accelerated graphics cluster is attached to Ranger!
This morning's keynote address was by Daniel Reed, currently at Microsoft. It was a thought-provoking talk. The main themes were: how to deal with the data tsunami; available systems shape research agendas (a corollary of the Sapir-Whorf hypothesis that language influences the habitual thought of its speakers); bulk computing is almost free, but software and power are not; moving data is still hard; people are incredibly expensive; and robust software remains extremely labor-intensive.
Faster processors enable more new software features which result in slower programs which create demand for even faster processors.
Dr. Reed highlighted the impact that multi-core processors are having (and will continue to have) and the problems they will cause in creating software that can take advantage of them to gain performance.
MPI still dominates, but the level of abstraction needs to be raised. He asserted that the purpose of the national investment in computer hardware should be to enable science, not to turn researchers into computer scientists.
The other wave of the future he focused on is "cloud computing". Microsoft's "cloud" is growing at an exponential rate; their new data center in Chicago is just under the 200 MW power envelope. He said the physical plant and power are the real costs, while hardware is almost free. Currently, funding agencies pay for hardware acquisition and the institution pays for physical plant and power, which means the funding model needs to change.