How should I wait for a thread to finish with a timeout using C++11?

Time:01-28

I have Windows multi-threaded code that I'm trying to make portable by using the C++11 thread and synchronization classes. How should the main thread wait for a worker thread with a timeout? I tried using a condition variable, but the main thread can miss the worker's notification if it starts waiting too late. In Windows I would use an "event object" as shown below:

#include <Windows.h>
#include <tchar.h>  // _tmain, _TCHAR
#include <stdio.h>  // printf

HANDLE  hWorkerDoneEvent;   // synchronization event

DWORD WINAPI ThreadFunc(LPVOID lpParameter)
{
    printf("worker: launched; working\n");
    Sleep(1000);    // pretend to work
    printf("worker: signal done\n");
    SetEvent(hWorkerDoneEvent); // signal main thread that we're done working
    printf("worker: exit\n");
    return 0;
}

int _tmain(int argc, _TCHAR* argv[])
{
    hWorkerDoneEvent = CreateEvent(NULL, FALSE, FALSE, NULL);   // create auto-reset event
    if (hWorkerDoneEvent == NULL) {
        printf("main: error creating event\n");
        return 0;
    }
    printf("main: launch worker\n");
    if (CreateThread(NULL, 0, ThreadFunc, NULL, 0, NULL) == NULL) { // create worker thread
        printf("main: error launching thread\n");
        return 0;
    }
    printf("main: delay a bit\n");
    Sleep(2000);    // demonstrate that it's impossible to miss the worker's signal by waiting too late
    printf("main: wait for worker's done signal\n");
    if (WaitForSingleObject(hWorkerDoneEvent, 5000) == WAIT_OBJECT_0) { // wait for worker's signal
        printf("main: worker finished normally\n");
    } else {
        printf("main: worker timeout or error\n");
    }
    return 0;
}

The Windows program's output is as expected:

main: launch worker
main: delay a bit
worker: launched; working
worker: signal done
worker: exit
main: wait for worker's done signal
main: worker finished normally

My attempt to replicate the above using C++11:

#include <cstdio>   // printf
#include <tchar.h>  // _tmain, _TCHAR (MSVC-specific)
#include <thread>
#include <chrono>
#include <mutex>
#include <condition_variable>

using namespace std;

condition_variable  cvWorkerDone;
mutex mtxWorkerDone;

void ThreadFunc()
{
    printf("worker: launched; working\n");
    this_thread::sleep_for(std::chrono::milliseconds(1000));    // pretend to work
    printf("worker: signal done\n");
    cvWorkerDone.notify_all();  // signal main thread that we're done working
    printf("worker: exit\n");
}

int _tmain(int argc, _TCHAR* argv[])
{
    printf("main: launch worker\n");
    thread worker(ThreadFunc);
    printf("main: delay a bit\n");
    this_thread::sleep_for(std::chrono::milliseconds(2000));    // causes timeout because we waited too late and missed worker's notification
    printf("main: wait for worker's done signal\n");
    unique_lock<mutex> lk(mtxWorkerDone);
    if (cvWorkerDone.wait_for(lk, std::chrono::milliseconds(5000)) == cv_status::no_timeout) {
        printf("main: worker finished normally\n");
    } else {
        printf("main: worker timeout or error\n");
    }
    worker.join();
    return 0;
}

The C++11 program's output. Note that the main thread times out even though the worker did its work and signaled:

main: launch worker
worker: launched; working
main: delay a bit
worker: signal done
worker: exit
main: wait for worker's done signal
main: worker timeout or error

In the C++11 code, the main thread incorrectly times out because the notification isn't sticky: notify_all only wakes threads that are already waiting. I understand that std::binary_semaphore would solve my problem, but it is only available in C++20, which I don't currently have access to. And yes, I could put a short wait at the start of the worker to give the main thread time to reach its wait, but that's gross, race-prone, and inefficient.

I also tried having the worker lock a timed_mutex while the main thread does a try_lock_for on that same mutex, as shown below, but this is equally wrong (it deadlocks):

#include <cstdio>   // printf
#include <tchar.h>  // _tmain, _TCHAR (MSVC-specific)
#include <thread>
#include <chrono>
#include <mutex>

using namespace std;

timed_mutex mtxWorkerDone;

void ThreadFunc()
{
    printf("worker: launched\n");
    printf("worker: delay a bit\n");
    this_thread::sleep_for(std::chrono::milliseconds(500)); // fools main thread into locking mutex before we do, causing deadlock
    printf("worker: locking mutex\n");
    unique_lock<timed_mutex> lk(mtxWorkerDone);
    printf("worker: mutex locked; working\n");
    this_thread::sleep_for(std::chrono::milliseconds(1000));    // pretend to work
    printf("worker: exit (releasing lock)\n");
}

int _tmain(int argc, _TCHAR* argv[])
{
    printf("main: launch worker\n");
    thread worker(ThreadFunc);
    printf("main: wait for worker's done signal\n");
    unique_lock<timed_mutex> lk(mtxWorkerDone, defer_lock);
    if (lk.try_lock_for(std::chrono::milliseconds(5000))) {
        printf("main: worker finished normally\n");
    } else {
        printf("main: worker timeout or error\n");
    }
    worker.join();
    return 0;
}

The output: the main thread thinks the worker finished normally, but the worker is actually blocked at its unique_lock and never gets to its work.

main: launch worker
worker: launched
worker: delay a bit
main: wait for worker's done signal
main: worker finished normally
worker: locking mutex

To reiterate, I'm looking for a stdlib solution in C++11 that doesn't depend on sleeping to avoid races. I have often ported firmware from Windows to embedded platforms, and often faced similar issues. Windows is the Cadillac of synchronization APIs, with a tool for every occasion. Even twenty years later I still see posts asking how to replace WaitForMultipleObjects. Anyway, I expected this to be a PITA and I'm not disappointed. :)

Answer:

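I may well be missing something, and I had to adapt your code a bit since I'm doing this quickly over lunch, but I believe this is basically what you're looking for and, as far as I can tell, the correct way to go about it. A condition variable should be paired with a shared flag: the worker sets finished under the mutex before notifying, and the predicate form of wait_for() checks that flag before blocking, so it returns true right away if the worker is already done. Note the added lambda in the wait_for() call and the lock_guard in ThreadFunc():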

#include <cstdio>   // printf
#include <thread>
#include <chrono>
#include <mutex>
#include <condition_variable>

using namespace std;

condition_variable  cvWorkerDone;
mutex mtxWorkerDone;
bool finished = false;

void ThreadFunc()
{
    printf("worker: launched; working\n");
    this_thread::sleep_for(std::chrono::milliseconds(1000));    // pretend to work
    printf("worker: signal done\n");
    std::lock_guard<std::mutex> lk(mtxWorkerDone);
    finished = true;
    cvWorkerDone.notify_all();  // signal main thread that we're done working
    printf("worker: exit\n");
}

int main(int argc, char* argv[])
{
    printf("main: launch worker\n");
    thread worker(ThreadFunc);
    printf("main: delay a bit\n");
    this_thread::sleep_for(std::chrono::milliseconds(2000));    // deliberately wait too late; the predicate below still sees finished == true
    printf("main: wait for worker's done signal\n");
    {
        unique_lock<mutex> lk(mtxWorkerDone);
        if (cvWorkerDone.wait_for(lk, std::chrono::milliseconds(5000), []{return finished;})) {
            printf("main: worker finished normally\n");
        } else {
            printf("main: worker timeout or error\n");
        }
    }
    worker.join();
    return 0;
}