(HPC) Challenges to Exascale Supercomputing



After the recently concluded HPC/International Supercomputing conference, there has been quite a bit of talk about exascale computing. The idea is to push supercomputing into the realm of sustained exaflop computation (by 2018). [Let's ignore for a moment that it's pretty hard to sustain even the current petaflop levels, a problem that will no doubt be solved in the next few years ...]

Jack Dongarra, a leader in the area of supercomputing (and co-creator of leading mathematical packages such as LINPACK, EISPACK, etc.), recently gave an interview on this topic that I think makes for an interesting read (you can read the full interview here). I'm listing here the points that I found most worth pondering:
  • Going from petascale to exascale will mean going from hundreds of thousands of threads to billions of threads.
    • This shift is similar to the shift from vector programs to parallel programming.
    • The strategy used to achieve petascale will no longer scale to the exascale level, so programs will need to be redesigned.
    • Programs will need to have asynchronous handling built in.
  • Exascale programs and machines will essentially be hybrids; purely MPI or purely loop-based programs will no longer be viable at this scale, so a fork-join model will no longer work.
  • Memory is going to be at least as big a factor as CPU: for cost, for heat, and for latency and computational issues.
  • Programs will have to build in fault tolerance. At that scale, something is bound to fail, and checkpointing lets you restart.
  • Machines will be either lightweight parallel (Blue Gene style, with lots of simple threads) or commodity processors with GPU accelerators.
  • International cooperation is a must. Involvement of governments and of international bodies like the G-8 will be a critical driver.
  • The community will drive development and push it out to the vendors.
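The checkpoint-and-restart point in the list above can be illustrated with a minimal single-process sketch. It is only an illustration of the idea, not how any particular exascale code does it: the file format (pickle), the checkpoint interval, the toy workload (summing integers), and the `fail_at` parameter that simulates a crash are all assumptions made for this example.

```python
import os
import pickle
import tempfile

CHECKPOINT_EVERY = 100  # hypothetical interval, in steps


def save_checkpoint(path, state):
    """Write state atomically: dump to a temp file, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, path)  # a crash mid-write never corrupts the old checkpoint


def load_checkpoint(path):
    """Return the last saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "total": 0}


def run(path, n_steps, fail_at=None):
    """Sum the integers 0..n_steps-1, checkpointing periodically.

    fail_at simulates a failure at the given step (illustration only).
    """
    state = load_checkpoint(path)  # resume from the last checkpoint, if any
    while state["step"] < n_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % CHECKPOINT_EVERY == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

Calling `run` again with the same path after a failure repeats only the work done since the last checkpoint. On a real machine each process would presumably write its own state to shared storage, and the cost of doing so grows with scale, which is one reason fault tolerance has to be designed in rather than bolted on.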
Exascale is an ambitious and complex goal, and the journey will be as interesting to follow for the human pursuit as for the technical one. As a major HPC vendor, the Sun systems group (inside Oracle) is watching and following these developments very closely. Compilers and tools are an integral part of such a pursuit; they always have been and will continue to be critical.

About

I have worked with Sun and Oracle for 25 years now: in the compilers and tools organization for most of those years, followed by a couple of years in Cloud Computing. I am now in ISV Engineering, where our primary task is to improve synergy between Oracle Sun Systems and our rich ISV ecosystem.
