(HPC) Challenges to Exascale Super-computing
By Vtatkar-Oracle on Jun 10, 2010
After the recently concluded HPC/International Supercomputing conference, there is quite a bit of talk about exascale computing. The idea here is to push supercomputing into the realm of sustainable exaflop computation (by 2018). [Let's ignore for a moment that it's pretty hard to sustain even the current petaflop levels, a problem that will no doubt be solved in the next few years ...]
Jack Dongarra, a leader in the area of supercomputing (and co-creator of leading mathematical packages such as LINPACK, EISPACK, etc.), recently gave an interview on this topic which I think makes for an interesting read (you can read the full interview here). I'm listing here the points that I found most worth pondering over:
- Going from Petascale to Exascale will mean going from hundreds of thousands of threads to billions of threads
- This shift is similar to the shift from vector programs to parallel programming
- The strategy used to achieve petascale will no longer scale to the exascale level, so programs will need to be redesigned
- Programs will need to have asynchronous handling built in.
- Exascale programs/machines will essentially be hybrids, and purely MPI or loop-based programs will no longer be viable at this scale. Thus a fork-join model will no longer work.
- Memory is going to be at least as big a factor as CPU: for cost, for heat considerations, and for latency/computational issues.
- Programs will have to build in fault tolerance. At that scale, something is bound to fail, and checkpointing lets you restart from the last saved state instead of from scratch.
- Machines will be either lightweight parallel (Blue Gene style, with lots of simple threads) or commodity processors with GPU accelerators.
- International cooperation is a must. The involvement of governments and of international bodies like the G-8 will be a critical driver.
- The community will drive development.
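
To make the fault-tolerance point a bit more concrete, here is a minimal single-process sketch of checkpoint/restart in Python. It is only an illustration of the idea, not how a real exascale code would do it: the function names, the JSON file format, the `every` interval and the `fail_at` knob (which simulates a failure) are all my own assumptions for the example.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then rename it over the target."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)  # rename, so a crash never leaves a half-written file


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def run(n_steps, path, every=10, fail_at=None):
    """Sum 0..n_steps-1, saving a checkpoint every `every` steps.

    `fail_at` is a hypothetical knob that simulates a failure at that step.
    """
    state = load_checkpoint(path)
    for step in range(state["step"], n_steps):
        if fail_at is not None and step == fail_at:
            raise RuntimeError(f"simulated failure at step {step}")
        state["total"] += step
        state["step"] = step + 1
        if state["step"] % every == 0:
            save_checkpoint(path, state)
    return state["total"]


def demo():
    """Run once with a simulated failure, then restart from the checkpoint."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "ckpt.json")
        try:
            run(100, path, fail_at=57)
        except RuntimeError:
            pass  # the "job" died; only the last checkpoint survives
        resumed_from = load_checkpoint(path)["step"]
        total = run(100, path)  # picks up from the saved step
        return resumed_from, total
```

In `demo()`, the first run dies at step 57, so the second run resumes from the last checkpoint (step 50) and redoes only the lost steps. The trade-off is the checkpoint interval: saving more often loses less work on failure but spends more time writing state.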