    August 28, 2006

An OpenMP example

Guest Author

OpenMP is a way of adding parallelism to an application. It takes the form of 'pragmas' (compiler directives) that you add to the source code. There are three advantages of this approach over using pthreads:

  • The changes to the source code can be made incrementally. So you can just add a line to the one bit of code that you know could be parallelised, whilst leaving the rest of the code untouched.
  • The pragmas are only enabled if the code is compiled with the -xopenmp compiler flag. This means that you can make a parallel and a serial version of the code from the same source base.
  • Very few lines of text are needed to move from serial code to multi-threaded code.
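The second point can be illustrated with a minimal sketch. It assumes only the standard _OPENMP macro, which an OpenMP-enabled compile defines, and the omp_get_max_threads() call from omp.h. The helper names max_threads_available and fill_squares are hypothetical, chosen for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>   /* only pulled in when OpenMP is enabled */
#endif

/* Hypothetical helper: how many threads a parallel region may use. */
int max_threads_available(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;      /* serial build: the pragmas are ignored */
#endif
}

/* Hypothetical helper: the same loop serves both builds.
   Without OpenMP the pragma is skipped and the loop runs serially. */
void fill_squares(int *out, int n)
{
    int i;
    assert(out != NULL);
    #pragma omp parallel for
    for (i = 0; i < n; i++)
        out[i] = i * i;
}
```

Compiled without the OpenMP flag, this is ordinary serial C; compiled with it, the loop iterations are shared between threads and the result is the same.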

Here's an example of a short program that performs a summation using multiple threads:

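#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int i, j;
    int *array;
    int total = 0;

    /* 10M ints, zero-initialised by calloc */
    array = (int *)calloc(10 * 1024 * 1024, sizeof(int));
    if (array == NULL) return 1;
    /* Repeat the summation 100 times so the run is long enough to time */
    for (j = 0; j < 100; j++) {
        #pragma omp parallel for reduction(+:total)
        for (i = 0; i < 10 * 1024 * 1024; i++)
            total += array[i];
    }
    printf("Total = %10i\n", total);
    free(array);
    return 0;
}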

The critical line in the program is the pragma. It is (hopefully) easy to read, but probably bears some explanation. The pragma says that it is an OpenMP construct, and that the following for statement should be executed in parallel. It then lists the variable total as being a reduction summation; this is probably the most complex step. The variable total needs to be updated by all the threads, and if this variable were shared between the threads, every update would have to be protected by a mutex. To get around this, the variable is declared as a reduction, which gives each thread its own private copy. At the end of the parallel region, these copies are added together and placed into the 'real' variable total, and this value is reported as the result of the code.
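To make the reduction step more concrete, here is a rough hand-written equivalent. It is a conceptual sketch of what the clause does, not a description of how any particular compiler implements it. The function name sum_manual_reduction is hypothetical, and the accumulator is widened to long as an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: sums n ints using a per-thread partial sum,
   mimicking by hand what reduction(+:total) arranges automatically. */
long sum_manual_reduction(const int *array, int n)
{
    long total = 0;                 /* shared between all threads */
    assert(array != NULL && n >= 0);

    #pragma omp parallel
    {
        long local = 0;             /* declared inside the region: private to each thread */
        int i;

        #pragma omp for
        for (i = 0; i < n; i++)
            local += array[i];      /* no locking needed on a private copy */

        /* One protected update per thread, instead of one per element */
        #pragma omp critical
        total += local;
    }
    return total;
}
```

The point of the design is that the expensive synchronisation happens once per thread at the end of the region rather than on every addition. If the code is compiled without OpenMP, the pragmas are ignored and the function still returns the correct sum.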

To compile this program, you need to use the -xopenmp flag.

cc -O -xopenmp -o scale scale.c

The environment variable OMP_NUM_THREADS determines how many threads are used by the program. Here's the timing data from running this at various numbers of threads:

$ setenv OMP_NUM_THREADS 1
$ timex scale
Total = 0
real 5.05
user 4.94
sys 0.07
$ setenv OMP_NUM_THREADS 2
$ timex scale
Total = 0
real 2.62
user 4.94
sys 0.08
$ setenv OMP_NUM_THREADS 4
$ timex scale
Total = 0
real 1.33
user 4.64
sys 0.12
$ setenv OMP_NUM_THREADS 8
$ timex scale
Total = 0
real 0.74
user 4.45
sys 0.10

In the example, you can see that the user and system times have remained fairly stable, as you might expect, since the same amount of work is being done. However, the real (elapsed) time has dropped roughly in inverse proportion to the number of threads, so each doubling of the thread count nearly halves the elapsed time.
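The scaling can be quantified from the real times reported above. The sketch below uses the usual definitions, speedup = T(1 thread) / T(n threads) and efficiency = speedup / n. The helper names speedup, efficiency and print_scaling are hypothetical, and the timing figures are copied from the runs shown.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: how many times faster the parallel run is. */
double speedup(double t_serial, double t_parallel)
{
    assert(t_parallel > 0.0);
    return t_serial / t_parallel;
}

/* Hypothetical helper: fraction of ideal (linear) scaling achieved. */
double efficiency(double t_serial, double t_parallel, int threads)
{
    assert(threads > 0);
    return speedup(t_serial, t_parallel) / threads;
}

/* Print a small table for the timex results quoted in the text. */
void print_scaling(void)
{
    const int threads[] = {1, 2, 4, 8};
    const double real_secs[] = {5.05, 2.62, 1.33, 0.74};
    int k;

    for (k = 0; k < 4; k++)
        printf("%d threads: speedup %.2f, efficiency %.0f%%\n",
               threads[k],
               speedup(real_secs[0], real_secs[k]),
               100.0 * efficiency(real_secs[0], real_secs[k], threads[k]));
}
```

For these figures the speedups come out at roughly 1.9, 3.8 and 6.8 on 2, 4 and 8 threads, so the efficiency falls from about 96% to about 85% as threads are added.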

This is a very simple example. The point is to illustrate the ease with which OpenMP statements can be mixed into existing code, and also the kind of scaling that is possible with MP systems.
