Video Lectures on Parallel Programming
By Ruud on Dec 05, 2010
A series of 7 short video introductions into parallel programming are now available on line. They can be viewed here. Each video is about 10-15 minutes long and covers a specific topic, or topics. Although it is recommended to view them in sequence, this is not required.
The slides are available as well and can be downloaded from the same web page.
Readers interested in these topics may also want to take a look at the "Parallel Programming with Oracle Developer Tools" white paper. There is little overlap between the two. The videos cover the big picture, whereas the paper goes into much more detail on the programming aspects.
The first video sets the stage. It covers the more general topic of application performance tuning, summarizing the different ways how to optimize an application, with parallelization being one of them.
The second video is about multicore architectures. The main purpose of this talk is to underline that multicore is here today and supported by all major microprocessor manufacturers. Admittedly, the processors covered here have been replaced by their respective successors, but that is in the nature of the beast. This topic is after all a moving target, since new microprocessors appear on the horizon all the time. For the purpose of the talks given here, the processor details are not relevant however. Of course there is an impact on the performance, but when parallelizing an application, the details of the processor architecture need not be taken into account.
The most important thing to realize is that there are three features that are going to stay for a long time to come. First of all, multicore is not going to go away; developers can count on any new general purpose microprocessor to support parallel execution in hardware. Secondly, not only does the number of threads continue to increase, the cost of a thread continues to come down too.
The next video covers parallel architectures. There are even more different architectures than processors. This is why the systems described here are generic and covered at the block diagram level. The goal is to help understand at a high level what the differences between an SMP system, a single/multi core cluster and a cc-NUMA system are.
Video number 4 is another conceptual talk, but with an exclusive focus on parallel programming. This is where you'll find more about topics like threads, parallel overheads and Amdah'ls law.
By then, all necessary information to get started writing a parallel program has been covered and it is high time to dig deeper into parallel programming.
The fifth video covers the Message Passing Interface (MPI) in some detail. This is a widely used distributed memory model, targeting a cluster of systems. MPI has been around for quite some time, but is still alive and kicking. Many recent other distributed memory programming models (e.g. CUDA) rely on the same principles. This is why the information presented here is of more general use than to those only interested in using MPI.
The next video is about shared memory parallel programming, starting with automatic parallelization. Often overlooked, but absolutely worth giving a try. The mechanism is simply activated by using the appropriate option on the compiler. Success or failure depends on many factors, but compiler technology continues to improve and can handle increasingly complex code structures.
OpenMP is the second shared memory model covered in this video. It is a mature and easy to use directive based model to explicitly parallelize applications for multicore based systems. Due to the multicore revolution, interest in OpenMP has never been as strong as it is today.
Clusters of multicore based systems are more and more common. The question is how to program them. This is where the Hybrid Parallel Programming model comes into the picture. It is the topic of the short 7-th video. With a Hybrid model, a distributed memory model (like MPI) is used to parallelize the application across the nodes of the cluster. Within the node one can either use MPI (or a similar model) again, but since the overhead of such models tends to be relatively high, a more lightweight model like OpenMP (or a native threading model like POSIX threads) is often more efficient.
This last video only touches upon this important and interesting topic. Those interested to learn much more about it may want to read the appropriate chapters in my white paper on parallel programming.
The title of this 7-th talk includes "What's Next?". It is very hard to predict what's down the road two or more years from now, but it is very safe to assume that parallel computing is here to stay and will only get more interesting. Anybody developing software is strongly advised to look into parallel programming as a way to enhance the performance of his or her application.
I would like to acknowledge Richard Friedman and Steve Kimmey at Oracle, as well as Deirdré Straughan (now at Joyent) for their support creating these videos and to make them available to a wide audience.