schedule() affecting private thread variables in OpenMP #pragma omp parallel for

Time:01-13

I've been following and modifying an OpenMP tutorial in C/C++ to demonstrate and understand how schedule() works in #pragma omp parallel for. This is my code:

#include <unistd.h>
#include <stdlib.h>
#include <omp.h>
#include <stdio.h>

#define THREADS 4
#define N 100

int main ( ) {
  int i;
  int perThread=0;
  printf("Running %d iterations on %d threads.\n", N, THREADS);
  #pragma omp parallel for num_threads(THREADS) private(perThread) //schedule(static)
  for (i = 0; i < N; i++) {
    perThread++;
    printf("Thread: %d\t loops: %d\n", omp_get_thread_num(), perThread);
    usleep(10000); // to slow the process down a bit 

    //Uncomment below to simulate one thread taking longer on each loop 
    // if(omp_get_thread_num()==1)
    //   sleep(1);
  }
  
  // all threads done
  printf("All done!\n");
  return 0;
}

I saved it as "schedule_example.cpp" and compiled it with:

g++ schedule_example.cpp -fopenmp -o SheduleEx

I then compared it with schedule(static) on line 13 uncommented, and again with various other options for schedule(), i.e. schedule(static,25), schedule(static,5), schedule(dynamic), schedule(dynamic,5), and schedule(runtime).

The scheduler works and the code demonstrates the difference (particularly when lines 20 and 21 are uncommented.)

The problem is that for some but not all options of schedule() the starting value of perThread is changed for some but not all threads, which can be seen in the printed output.

I've run the code on a few different machines and they've all shown similar results. I used WSL on my Windows 10 laptop; g++ --version returns: g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0.

Taking the final 8 lines of output as an example: if schedule() is commented out, or schedule(static) is used, the output is correct:

 Thread: 2 loops: 24
 Thread: 0 loops: 24
 Thread: 1 loops: 24
 Thread: 3 loops: 24
 Thread: 2 loops: 25
 Thread: 1 loops: 25
 Thread: 0 loops: 25
 Thread: 3 loops: 25
 All done!

But if I use anything else for schedule(), even schedule(static,25), which should give the same result, it compiles and runs but the last few lines of output are:

 Thread: 1 loops: 24
 Thread: 3 loops: 24
 Thread: 0 loops: 22010
 Thread: 2 loops: 24
 Thread: 1 loops: 25
 Thread: 3 loops: 25
 Thread: 0 loops: 22011
 Thread: 2 loops: 25
 All done!

The problem is that the starting value of perThread has been set to 21986, but only for thread 0.

If I rerun it without recompiling, I get similar results: it is always thread 0 that is wrong, by about 22000, but not by the same amount each time. If I recompile before rerunning, it gives the same results.

I then ran the same code on a Raspberry Pi and got similar but slightly different results. g++ --version returns: g++ (Raspbian 10.2.1-6+rpi1) 10.2.1 20210110

The Raspberry Pi only prints out the correct value for loops on all threads if schedule(dynamic) or schedule(dynamic, X) is used - I tried 1, 5, and 25 as values for X.

If (static) or (static, X) is used, all threads except thread 0 have a starting value of around 67321. This number is always the same for threads 1, 2, and 3, and is often, but not always, the same between successive runs of the code. (auto) behaves the same as (static). However, (runtime) is the opposite of (static): only thread 0 is wrong, and it is also off by about 67481. When I ran it a few times in a row, it was wrong by the same amount each time.

I ran the same code again on a different PC with Arch Linux and got similar results to the Windows 10 laptop.

In terms of an actual question, is there something I'm doing wrong with how I've written the code? Is there a way to ensure the thread's variables aren't changed?

Sorry it's such a long post but I think the core of the issue is that schedule() is somehow affecting the variable in private() for some of the threads at the beginning of the parallel for loop, some of the time.

Thank you

Answer:

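You should use firstprivate(perThread) instead of private(perThread). With the private clause, each thread gets its own copy of the variable, but that copy is not initialized from the original, so its value is indeterminate when the loop starts. Any starting value you see, including an apparently correct 0, is a matter of chance.

The OpenMP specification states: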

the firstprivate clause declares one or more list items to be private to a task, and initializes each of them with the value that the corresponding original item has when the construct is encountered.

so you have to use this clause.
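
As a rough sketch of that fix (not the original poster's exact program), the loop from the question could look like the code below. The helper name count_iterations and the counts array are made up for illustration, and the #ifdef fallback is only there so the sketch also builds without -fopenmp. Each thread's counter starts at 0 because firstprivate copies the original value in.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback (assumption for this sketch): lets the file build without
   -fopenmp, in which case the loop simply runs on a single thread. */
static int omp_get_thread_num(void) { return 0; }
#endif

#define THREADS 4
#define N 100

/* Hypothetical helper: runs the loop from the question with firstprivate
   and stores each thread's latest counter value in counts[]. */
void count_iterations(int counts[THREADS]) {
  int i;
  int perThread = 0;

  for (i = 0; i < THREADS; i++)
    counts[i] = 0;

  /* firstprivate: every thread's private copy starts from the value
     perThread had when the construct was encountered, i.e. 0. */
  #pragma omp parallel for num_threads(THREADS) firstprivate(perThread) schedule(static, 25)
  for (i = 0; i < N; i++) {
    perThread++;
    /* Each thread writes only its own slot, so there is no data race. */
    counts[omp_get_thread_num()] = perThread;
  }
}
```

Calling count_iterations from main and printing the array should show the per-thread totals adding up to N for any schedule() option. If you would rather not rely on a clause, another option is to open a plain #pragma omp parallel region, declare and initialize the counter inside it (variables declared inside the region are private to each thread), and put #pragma omp for on the loop.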
