OpenMP integer copied after tasks finish

Time:08-06

I do not know if this is documented anywhere (if so, I would love a reference to it), but I have found some unexpected behaviour when using OpenMP. The simple program below illustrates the issue. Here, in point form, is what I expect the program to do:

  • I want to have 2 threads
  • They both share an integer
  • The first thread increments the integer
  • The second thread reads the integer
  • After incrementing once, an external process must tell the first thread to continue incrementing (via a mutex lock)
  • The second thread is in charge of unlocking this mutex

As you will see, the counter shared between the threads is not updated properly for the second thread. However, if I turn the counter into an integer reference instead, I get the expected result. Here is a simple code example:

#include <mutex>
#include <thread>
#include <chrono>
#include <iostream>
#include <omp.h>

using namespace std;
using std::this_thread::sleep_for;
using std::chrono::milliseconds;

const int sleep_amount = 2000;

int main() {

  int counter = 0; // if I comment this and uncomment the 2 lines below, I get the expected results
  /* int c = 0; */
  /* int &counter = c; */

  omp_lock_t mut;
  omp_init_lock(&mut);
  int counter_1, counter_2;

#pragma omp parallel
#pragma omp single
  {
#pragma omp task default(shared)
// The first task just increments the counter 3 times
    {
      while (counter < 3) {
        omp_set_lock(&mut);
        counter += 1;
        cout << "increasing: " << counter << endl;
      }
    }
#pragma omp task default(shared)
    {
      sleep_for(milliseconds(sleep_amount));
      // While sleeping, counter is increased to 1 in the first task
      counter_1 = counter;
      cout << "counter_1: " << counter << endl;

      omp_unset_lock(&mut);
      sleep_for(milliseconds(sleep_amount));
      // While sleeping, counter is increased to 2 in the first task
      counter_2 = counter;
      cout << "counter_2: " << counter << endl;
      omp_unset_lock(&mut);
      // Release one last time to increment the counter to 3
    }
  }
  omp_destroy_lock(&mut);

  cout << "expected: 1, actual: " << counter_1 << endl;
  cout << "expected: 2, actual: " << counter_2 << endl;
  cout << "expected: 3, actual: " << counter << endl;
}

Here is my output:

increasing: 1
counter_1: 0
increasing: 2
counter_2: 0
increasing: 3
expected: 1, actual: 0
expected: 2, actual: 0
expected: 3, actual: 3

gcc version: 9.4.0

Additional discoveries:

  • If I use OpenMP 'sections' instead of 'tasks', I get the expected result as well. The problem seems to be specific to 'tasks'
  • If I use POSIX semaphores, the problem persists.

CodePudding user response:

It is not permitted to unlock a mutex from a thread other than the one that locked it. Doing so causes undefined behavior. The general solution in this case is to use semaphores; condition variables can also help (for real-world use cases). To quote the OpenMP documentation (note that this constraint is shared by nearly all mutex implementations, including pthreads):

A program that accesses a lock that is not in the locked state or that is not owned by the task that contains the call through either routine is non-conforming.
A program that accesses a lock that is in the uninitialized state through either routine is non-conforming.

Moreover, the two tasks can be executed on the same thread or on different threads. You should not assume anything about their scheduling unless you tell OpenMP about it, for example with task dependencies. Here, it is completely compliant for a runtime to execute the tasks serially. If you need different threads to execute the two blocks, use OpenMP sections. Besides, it is generally considered bad practice to use locks in tasks, as the runtime scheduler is not aware of them.

Finally, you do not need a lock in this case: atomic operations are sufficient. Fortunately, OpenMP supports atomic operations (as does C++).


Additional notes

Note that locks guarantee the consistency of memory accesses across threads thanks to memory barriers. Indeed, an unlock operation on a mutex causes a release memory barrier that makes prior writes visible to other threads, and a lock operation performs an acquire memory barrier that forces subsequent reads to happen after the lock is taken. When locks and unlocks are not paired correctly, memory accesses are no longer safe, which can cause, for example, a variable not to appear updated in another thread. More generally, this also tends to create race conditions. So, put shortly: don't do that.
