I have a question that is similarly answered here, but it is not exactly what I need. I have two threads, each with a loop. Now I want to force the two threads to work like a flip-flop: exactly ABABAB or BABABA... It is not important which one starts first, but they must work one at a time.
Here is a very simple version of my code, but it does not work well because thread A iterates super fast and takes the lock again. Please help me, as I am learning C++ multi-threading.
1- In the linked question it is said that having two threads is maybe not the best approach. Assume it is a game and I must run one iteration for player A and one iteration for player B. I agree it does not give much better efficiency, because at each moment only one of them is working, but I want to learn whether there is a way.
#include <iostream>
#include <mutex>
#include <thread>
using namespace std;

mutex mtx;
int pointA, pointB;

void testA()
{
    int i = 0;
    while (i < 10)
    {
        unique_lock<std::mutex> lck(mtx);
        cout << pointB << endl;
        pointA++;
        i++;
    }
}

int main()
{
    int i = 0;
    pointA = 100, pointB = 0;
    thread t(testA);
    while (i < 10)
    {
        unique_lock<std::mutex> lck(mtx);
        cout << pointA << endl;
        pointB++;
        i++;
    }
    t.join();
}
CodePudding user response:
Without using the standard library's synchronization helpers, and if running the same code in both threads is allowed, you can branch the flow on a shared variable:
// in both threads
unique_lock<std::mutex> lck(mtx);
if(var && myId == "A")
{
// stuff that A does
var = false;
}
else if(!var && myId == "B")
{
// stuff that B does
var = true;
}
but this would be slow, because a thread often finds that its id does not match the turn variable, and re-checking every such case makes it even slower.
C++ has something to help with this:
std::condition_variable
by using condition variable, you can have a different condition per thread automatically triggered to stop waiting:
std::condition_variable cv;
...
std::unique_lock lk(mtx);
cv.wait(lk, []{return your_logic();});
Since the thread just sleeps inside the wait, it does not waste CPU cycles like the first example. Unnecessary wake-ups and processing are reduced, and memory bandwidth is not wasted either.
A more implicit way of combining the outputs of the two threads would be to use two thread-safe queues: one from A to B, and one from B to the output:
// assuming the implementation blocks .front() until an element arrives
ThreadSafeQueue q1;
ThreadSafeQueue q2;

// in thread A
for (int i = 0; i < 10; i++)
    q1.push("A");

// in thread B
for (int i = 0; i < 10; i++)
{
    q2.push(q1.front() + "B");
    q1.pop();
}

// in main thread
auto result = q2.front(); // "AB"
q2.pop();
With this pattern, thread B only works once for each result of thread A. But this does not synchronize the threads in time: thread A could fill its queue with 10 "A" values before thread B processes the 5th "AB", and before the main thread gets the 3rd "AB".
To enforce flip-flop-like work in time, you can limit the capacity of the queues to 1 or 2. Then the first queue blocks thread A until thread B consumes an element, and the second queue blocks thread B until the main thread consumes one.
Yet another way of synchronizing multiple threads across different tasks is a cyclic barrier:
[C++20]
std::barrier sync_point(size /*2?*/, func_on_completion);
// in thread A
..stuff..flip..
sync_point.arrive_and_wait();
..more stuff that needs updated stuff..
// in thread B
..stuff..flop..
sync_point.arrive_and_wait();
..more stuff that needs updated stuff..
the barrier makes sure both threads wait for each other before continuing. If this is in a loop, then they will process one step at a time (one step here meaning both A and B have produced something) while waiting for each other before the next iteration. So it will produce something like ABBAABABBABAAB, while never emitting more A's or more B's than the other. If A is always required before B, then you need more barriers to ensure the order:
// in thread A and B
if(thread is A)
output "A"
sync_point.arrive_and_wait();
if(thread is B)
output "B"
sync_point.arrive_and_wait();
this prints ABABABAB...
If you are using OpenMP, it has a barrier too:
#pragma omp parallel
{
...work...
#pragma omp barrier
...more work...
}
if you don't want the second part to run at the same time as the first part of the next iteration, you need two barriers:
for(...)
#pragma omp parallel
{
...work...
#pragma omp barrier
...more work...
#pragma omp barrier
}
if the order of the two threads' work within each iteration is still important, this requires a dedicated segment for each thread:
for(...)
#pragma omp parallel
{
if(thread is A?)
do this
#pragma omp barrier
if(thread is B?)
do that
#pragma omp barrier
}
this always writes ABABAB, although with decreased efficiency, because the OpenMP parallel-region start/stop overhead is high and measurable in a loop. It would be better to have a loop inside each thread instead:
#pragma omp parallel num_threads(2)
{
// this loop runs the same in both threads; it is not shared/parallelized
for(int i=0;i<10;i++)
{
int id=omp_get_thread_num();
if(id==0)
std::cout<<"A"<<std::endl;
#pragma omp barrier
if(id==1)
std::cout<<"B"<<std::endl;
#pragma omp barrier
}
}
this outputs ABABABABAB... and has no repeated OpenMP region start/stop overhead (but the barrier overhead still exists).
CodePudding user response:
Based on this answer and the answer above, I managed to write the code. We need one flag to switch the turn between the two loops. There is also another way, with a ready-go approach, explained well here; it is in C# but the concepts are the same:
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
using namespace std;
mutex mutex1;
condition_variable cv3;
char turn;
void ThreadA()
{
for (int i = 0; i < 1000; i++)
{
unique_lock<mutex> lock(mutex1);
cv3.wait(lock, [] {return (turn == 'A'); });
cout << "Thread A" << endl;
turn = 'B';
cv3.notify_all();
}
}
void ThreadB()
{
for (int i = 0; i < 1000; i++)
{
unique_lock<mutex> lock(mutex1);
cv3.wait(lock, [] {return (turn == 'B'); });
cout << "Thread B" << endl;
turn = 'A';
cv3.notify_all();
}
}
void ExecuteThreads()
{
turn = 'A';
std::thread t1(ThreadA);
std::thread t2(ThreadB);
t1.join();
t2.join();
std::cout << "Finished" << std::endl;
}
int main()
{
ExecuteThreads();
return 0;
}