I have Windows multi-threaded code that I'm trying to make portable by using the C++11 thread and synchronization classes. How should the main thread wait for a worker thread with a timeout? I tried using a condition variable, but it's possible for the main thread to miss the worker's notification by starting its wait too late. In Windows I would use an "event object" as shown below:
#include <windows.h>
#include <stdio.h>  // printf
#include <tchar.h>  // _tmain, _TCHAR
HANDLE hWorkerDoneEvent; // synchronization event
DWORD WINAPI ThreadFunc(LPVOID lpParameter)
{
printf("worker: launched; working\n");
Sleep(1000); // pretend to work
printf("worker: signal done\n");
SetEvent(hWorkerDoneEvent); // signal main thread that we're done working
printf("worker: exit\n");
return 0;
}
int _tmain(int argc, _TCHAR* argv[])
{
hWorkerDoneEvent = CreateEvent(NULL, FALSE, FALSE, NULL); // create auto-reset event
if (hWorkerDoneEvent == NULL) {
printf("main: error creating event\n");
return 0;
}
printf("main: launch worker\n");
if (CreateThread(NULL, 0, ThreadFunc, NULL, 0, NULL) == NULL) { // create worker thread
printf("main: error launching thread\n");
return 0;
}
printf("main: delay a bit\n");
Sleep(2000); // demonstrate that it's impossible to miss the worker's signal by waiting too late
printf("main: wait for worker's done signal\n");
if (WaitForSingleObject(hWorkerDoneEvent, 5000) == WAIT_OBJECT_0) { // wait for worker's signal
printf("main: worker finished normally\n");
} else {
printf("main: worker timeout or error\n");
}
return 0;
}
The Windows program's output is as expected:
main: launch worker
main: delay a bit
worker: launched; working
worker: signal done
worker: exit
main: wait for worker's done signal
main: worker finished normally
My attempt to replicate the above using C++11:
#include <cstdio>   // printf
#include <thread>
#include <chrono>
#include <mutex>
#include <condition_variable>
using namespace std;
condition_variable cvWorkerDone;
mutex mtxWorkerDone;
void ThreadFunc()
{
printf("worker: launched; working\n");
this_thread::sleep_for(std::chrono::milliseconds(1000)); // pretend to work
printf("worker: signal done\n");
cvWorkerDone.notify_all(); // signal main thread that we're done working
printf("worker: exit\n");
}
int _tmain(int argc, _TCHAR* argv[])
{
printf("main: launch worker\n");
thread worker(ThreadFunc);
printf("main: delay a bit\n");
this_thread::sleep_for(std::chrono::milliseconds(2000)); // causes timeout because we waited too late and missed worker's notification
printf("main: wait for worker's done signal\n");
unique_lock<mutex> lk(mtxWorkerDone);
if (cvWorkerDone.wait_for(lk, std::chrono::milliseconds(5000)) == cv_status::no_timeout) { // wait_for without a predicate returns cv_status, not bool
printf("main: worker finished normally\n");
} else {
printf("main: worker timeout or error\n");
}
worker.join();
return 0;
}
The C++11 program's output; note that the main thread times out even though the worker did its work and signaled:
main: launch worker
worker: launched; working
main: delay a bit
worker: signal done
worker: exit
main: wait for worker's done signal
main: worker timeout or error
In the C++11 code, the main thread incorrectly times out because the notification isn't sticky; in other words, notify_all only has an effect if a thread is already waiting. I understand that std::binary_semaphore would solve my problem, but it's only available in C++20, which I don't currently have access to. And yes, I could put a short wait at the start of the worker to give the main thread time to get to its wait, but that's gross, race-prone, and inefficient.
I also tried having the worker lock a timed_mutex while the main thread does a try_lock_for on that same mutex, as shown below, but this is equally wrong (it deadlocks):
#include <cstdio>   // printf
#include <thread>
#include <chrono>
#include <mutex>
using namespace std;
timed_mutex mtxWorkerDone;
void ThreadFunc()
{
printf("worker: launched\n");
printf("worker: delay a bit\n");
this_thread::sleep_for(std::chrono::milliseconds(500)); // fools main thread into locking mutex before we do, causing deadlock
printf("worker: locking mutex\n");
unique_lock<timed_mutex> lk(mtxWorkerDone);
printf("worker: mutex locked; working\n");
this_thread::sleep_for(std::chrono::milliseconds(1000)); // pretend to work
printf("worker: exit (releasing lock)\n");
}
int _tmain(int argc, _TCHAR* argv[])
{
printf("main: launch worker\n");
thread worker(ThreadFunc);
printf("main: wait for worker's done signal\n");
unique_lock<timed_mutex> lk(mtxWorkerDone, defer_lock);
if (lk.try_lock_for(std::chrono::milliseconds(5000))) {
printf("main: worker finished normally\n");
} else {
printf("main: worker timeout or error\n");
}
worker.join();
return 0;
}
The output; the main thread thinks the worker finished normally, but the worker is actually blocked at the unique_lock and never gets to its work.
main: launch worker
worker: launched
worker: delay a bit
main: wait for worker's done signal
main: worker finished normally
worker: locking mutex
To reiterate, I'm looking for a stdlib solution in C++11 that doesn't depend on sleeping to avoid races. I have often ported firmware from Windows to embedded platforms, and often faced similar issues. Windows is the Cadillac of synchronization APIs, with a tool for every occasion. Even twenty years later I still see posts asking how to replace WaitForMultipleObjects. Anyway, I expected this to be a PITA and I'm not disappointed. :)
Answer:
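I may well be missing something, and I had to adapt your code a bit since I'm doing this quickly over lunch, but I believe this is basically what you're looking for, and as far as I can tell it's the correct way to go about it. The idea is to pair the condition variable with a flag that is set under the mutex: wait_for() checks the predicate before it blocks, so a notification sent before the main thread starts waiting is not lost. Note the addition of a lambda to the wait_for() call and the lock_guard in ThreadFunc():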
#include <cstdio>   // printf
#include <thread>
#include <chrono>
#include <mutex>
#include <condition_variable>
using namespace std;
condition_variable cvWorkerDone;
mutex mtxWorkerDone;
bool finished = false;
void ThreadFunc()
{
printf("worker: launched; working\n");
this_thread::sleep_for(std::chrono::milliseconds(1000)); // pretend to work
printf("worker: signal done\n");
std::lock_guard<std::mutex> lk(mtxWorkerDone);
finished = true;
cvWorkerDone.notify_all(); // signal main thread that we're done working
printf("worker: exit\n");
}
int main(int argc, char* argv[])
{
printf("main: launch worker\n");
thread worker(ThreadFunc);
printf("main: delay a bit\n");
this_thread::sleep_for(std::chrono::milliseconds(2000)); // deliberately start waiting late; the predicate still sees finished == true, so no timeout
printf("main: wait for worker's done signal\n");
{
unique_lock<mutex> lk(mtxWorkerDone);
if (cvWorkerDone.wait_for(lk, std::chrono::milliseconds(5000), []{return finished;})) {
printf("main: worker finished normally\n");
} else {
printf("main: worker timeout or error\n");
}
}
worker.join();
return 0;
}
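If you need this in more than one place, the same flag-plus-condition-variable pattern can be wrapped in a small class that behaves roughly like a Windows auto-reset event. Treat this as a sketch rather than a drop-in replacement: AutoResetEvent and demo_late_wait are names I made up for illustration (they are not standard library types), and the sleeps are shortened so the demo runs quickly.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper, not part of the standard library: a "sticky"
// auto-reset event built from a mutex, a condition variable, and a flag.
class AutoResetEvent {
public:
    // Mark the event as signaled and wake one waiter, if any.
    // The flag stays set until a waiter consumes it, so a late wait still succeeds.
    void set() {
        std::lock_guard<std::mutex> lk(mtx_);
        signaled_ = true;
        cv_.notify_one();
    }

    // Block until the event is signaled or the timeout expires.
    // Returns true (and resets the event) if signaled, false on timeout.
    template <class Rep, class Period>
    bool wait_for(const std::chrono::duration<Rep, Period>& timeout) {
        std::unique_lock<std::mutex> lk(mtx_);
        if (!cv_.wait_for(lk, timeout, [this] { return signaled_; }))
            return false;
        signaled_ = false; // auto-reset: one successful wait consumes the signal
        return true;
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Demo: the worker signals long before the main thread starts waiting.
bool demo_late_wait() {
    AutoResetEvent done;
    std::thread worker([&done] {
        std::this_thread::sleep_for(std::chrono::milliseconds(50)); // pretend to work
        done.set();                                                  // signal "done"
    });

    // Start waiting late on purpose; the signal is remembered by the flag.
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    const bool signaled = done.wait_for(std::chrono::seconds(1));
    worker.join();

    // A second, zero-length wait should fail because the event was reset.
    const bool still_set = done.wait_for(std::chrono::milliseconds(0));
    assert(!still_set);
    (void)still_set; // avoid an unused-variable warning when NDEBUG is defined

    return signaled;
}
```

In the program above, the worker would call done.set() where it currently sets finished and notifies, and the main thread would call done.wait_for(std::chrono::milliseconds(5000)) in place of the explicit lock and predicate. I used notify_one and a reset-on-wait to mimic an auto-reset event; if several threads should all see the signal, you would drop the reset and use notify_all instead.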