OpenMP is a way of using parallelism in an application. It takes the form of 'pragmas' that you add to the source code. There are two advantages of this approach over using pthreads.
Here's an example of a short program that performs a summation using multiple threads:
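#include <stdio.h>
#include <stdlib.h>

int main()
{
  int i, j;
  int *array;
  int total = 0;

  /* calloc zero-fills the array, so the expected total is 0 */
  array = (int*)calloc(10*1024*1024, sizeof(int));

  /* Repeat the summation 100 times */
  for (j = 0; j < 100; j++)
  {
    /* Split the inner loop across threads; each thread sums into its own copy of total */
    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < 10*1024*1024; i++)
      total += array[i];
  }

  printf("Total = %10i\n", total);
  free(array);
  return 0;
}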
The critical line in the program is the pragma. It is (hopefully) easy to read, but it bears some explanation. The pragma identifies itself as an OpenMP construct and says that the following for loop should be run in parallel. It then declares the variable total as a sum reduction; this is probably the most complex part. Every thread needs to update total, and if the variable were simply shared between the threads, each update would have to be protected by a mutex. Declaring it as a reduction avoids this: each thread gets its own private copy, and at the end of the parallel region these copies are added together and stored in the 'real' variable total, which the program then prints.
To compile this program with OpenMP enabled, you need to use the -xopenmp flag:
cc -O -xopenmp -o scale scale.c
The environment variable OMP_NUM_THREADS determines how many threads are used by the program. Here's the timing data from running this at various numbers of threads:
$ setenv OMP_NUM_THREADS 1
$ timex scale
Total = 0
real 5.05
user 4.94
sys 0.07
$ setenv OMP_NUM_THREADS 2
$ timex scale
Total = 0
real 2.62
user 4.94
sys 0.08
$ setenv OMP_NUM_THREADS 4
$ timex scale
Total = 0
real 1.33
user 4.64
sys 0.12
$ setenv OMP_NUM_THREADS 8
$ timex scale
Total = 0
real 0.74
user 4.45
sys 0.10
In the timings above, you can see that the user and system times have remained fairly stable, as you might expect, since the same amount of work is being done. However, the real time has fallen almost in proportion to the number of threads, from 5.05 seconds with one thread to 0.74 seconds with eight.
This is a very simple example. The point is to illustrate the ease with which OpenMP directives can be mixed into existing code, and the kind of scaling that is possible on multiprocessor systems.