How is OpenMP communicating between threads with what should be a private variable?

Time:05-23

I'm writing some code in C using OpenMP to parallelize some chunks. I ran into some strange behavior that I can't quite explain. I've reduced my code to a minimal example that reproduces the issue.

First, here is a function I wrote that is to be run in a parallel region.

#include <stdio.h>
#include <omp.h>

void foo()
{
    #pragma omp for
    for (int i = 0; i < 3; i++)
    {
        #pragma omp critical
        printf("Hello %d from thread %d.\n", i, omp_get_thread_num());
    }
}

Then here is my whole program.

int main()
{
    omp_set_num_threads(4);
    #pragma omp parallel
    {
        for (int i = 0; i < 2; i++)
        {
            foo();
            #pragma omp critical
            printf("%d\n", i);
        }
    }
    return 0;
}

When I compile and run this code (with g++ -std=c++17), I get the following output on the terminal:

Hello 0 from thread 0.
Hello 1 from thread 1.
Hello 2 from thread 2.
0
0
Hello 2 from thread 2.
Hello 1 from thread 1.
0
Hello 0 from thread 0.
0
1
1
1
1

i is a private variable. I would expect the function foo to be run twice per thread, so I would expect to see eight "Hello %d from thread %d.\n" lines in the terminal, just as I see eight numbers printed when printing i. So what gives here? Why does OpenMP behave so differently within the same loop?

CodePudding user response:

It is because #pragma omp for is a worksharing construct: it distributes the loop iterations among the threads of the team. The number of threads does not change how many times the loop body runs; only the total iteration count matters (2 outer iterations * 3 inner iterations = 6).

If you use omp_set_num_threads(1); you also see 6 outputs. If you use more threads than loop iterations, some threads will be idle in the inner loop, but you still see exactly 6 outputs.

On the other hand, if you remove the #pragma omp for line, every thread runs the whole inner loop itself, so you will see (number of threads)*2*3 outputs (= 24 with 4 threads).

CodePudding user response:

From the documentation of omp parallel:

Each thread in the team executes all statements within a parallel region except for work-sharing constructs.

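The key part is "except for work-sharing constructs". Since the omp for in foo is a work-sharing construct, its iterations are divided among the threads instead of being repeated by each of them, so each of its iterations is executed only once per outer iteration, no matter how many threads run the parallel block in main.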
