I have a piece of multithreaded code, and I'm not sure whether it is safe from a data race caused by compiler reordering.
Here is a minimal example:
#include <thread>

int main()
{
    int x = 0;
    x = 5;
    auto t = std::thread([&x]()
    {
        x;
    });
    t.join();
    return 0;
}
Is the assignment x = 5 guaranteed to happen before the thread starts?
CodePudding user response:
Short answer: The code will work as expected. No reordering will take place.
Long answer:
Compile time reordering
Let's consider what's going on.
1. You put a variable in automatic storage (x)
2. You create an object that holds a reference to this variable (the lambda)
3. You pass that object to an external function (the thread constructor)
The compiler does escape analysis during optimization. Because of this sequence of events, the variable x has escaped once step 3 is reached. From the compiler's point of view, any external function (except those marked as pure) may then read or modify the variable. Its value therefore has to be stored to the stack before each such function call and reloaded from the stack after the call.
You did not make x an atomic variable, so the compiler is free to ignore any potential multithreading effects. The value therefore does not need to be reloaded from memory between calls to external functions. It may still be reloaded if the compiler decides not to keep the value in a register between uses.
Let's annotate and expand your source code to show it:
int main()
{
    int x = 0;
    x = 5; // stores on stack for use by external function in next line
    auto t = std::thread([&x]() mutable
    {
        x;
    });
    int x1 = x; // loads x from stack after thread constructor may (in theory) have modified it
    int x2 = x; // probably no reload because not an atomic variable
    x = 7; // new value stored on stack because join() could access it (in theory)
    t.join();
    int x3 = x; // reload from stack because join() could have changed it
    return 0;
}
Again, this has nothing to do with multithreading. Escape analysis and external function calls are sufficient.
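To make that point concrete, here is a minimal single-threaded sketch. The names observe and store_then_call are hypothetical and exist only for illustration. observe stands in for any function the compiler cannot see into; it is defined in the same file only to keep the example self-contained, so a real optimizer may well inline it.

```cpp
#include <cassert>

// Hypothetical stand-in for an opaque external function.
// It reads through the pointer, so the caller's store must be visible to it.
int observe(const int* p)
{
    return *p;
}

// Hypothetical helper: &x escapes into observe(), so the store x = 5
// cannot be moved past the call.
int store_then_call()
{
    int x = 0;
    x = 5;                  // must take effect before the call below
    int seen = observe(&x); // x has escaped here
    assert(seen == 5);
    return seen;
}
```

No thread is involved, yet the store still has to be observable to the callee.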
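Any access from main() between thread creation and joining that conflicts with the thread's use of x (for example, a write in main() while the thread reads x) would also be undefined behavior, because it would be a data race on a non-atomic variable. But that's just a side note.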
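If main() really does need to touch the variable while the thread is running, one option is to make it a std::atomic. The following is only a sketch under that assumption; the function name concurrent_access_with_atomic and the variable seen are made up for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns the value the thread observed.
int concurrent_access_with_atomic()
{
    std::atomic<int> x{5};
    int seen = 0;
    std::thread t([&x, &seen]() {
        seen = x.load(); // atomic read, may run concurrently with the store below
    });
    x.store(7);          // atomic write between creation and join: no data race
    t.join();
    // seen is written only by the thread and read only after join()
    assert(seen == 5 || seen == 7);
    return seen;
}
```

Which of the two values the thread sees depends on scheduling, but either outcome is well defined.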
This takes care of the compiler behavior. But what about the CPU? May it reorder instructions?
Run time reordering
For this, we can look at the C++ standard, Section 32.4.2.2 [thread.thread.constr], clause 7:
Synchronization: The completion of the invocation of the constructor synchronizes with the beginning of the invocation of the copy of f.
The constructor here is the thread constructor, and f is the thread function, meaning the lambda in your case. So everything main() wrote before constructing the thread, including x = 5, is visible to the lambda when it starts.
The join() call also synchronizes. Therefore access to x after the join cannot suffer from run-time reordering either:
The completion of the thread represented by *this synchronizes with (6.9.2) the corresponding successful join() return.
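Both synchronization points can be exercised with a plain int, as in the sketch below. The function name write_before_start_read_after_join is hypothetical, and unlike the original example the lambda here actually reads and writes x so that the effect is observable.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper illustrating both synchronizes-with relations.
int write_before_start_read_after_join()
{
    int x = 0;
    x = 5;             // sequenced before the thread constructor
    std::thread t([&x]() {
        x = x + 1;     // sees 5: constructor completion synchronizes with this call
    });
    t.join();          // thread completion synchronizes with join() returning
    assert(x == 6);
    return x;          // plain read is safe after join()
}
```

main() does not touch x between constructing the thread and joining it, so there is no data race even though x is not atomic.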
Side note
Contrary to what some comments suggest, the compiler will not optimize the thread creation away, for two reasons: 1. No compiler is sufficiently magical to figure this out. 2. The thread creation may fail, which is defined behavior. Therefore it has to be included in the runtime.
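The failure case is reported as an exception: the std::thread constructor throws std::system_error if the thread cannot be started. A minimal sketch of handling it follows; the function name try_run_thread is hypothetical.

```cpp
#include <system_error>
#include <thread>

// Hypothetical helper: returns true if the thread started and was joined,
// false if the constructor reported that it could not start a thread.
bool try_run_thread()
{
    try {
        std::thread t([] {}); // may throw std::system_error
        t.join();
        return true;
    } catch (const std::system_error&) {
        return false;
    }
}
```

On a system with threads available this is expected to return true; the catch branch only matters when resources are exhausted or threading support is missing.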