Why is atomic bool needed to avoid data race?

I was looking at Listing 5.13 in C++ Concurrency in Action by Anthony Williams, and I am confused by the comment "the store to and load from y still have to be atomic; otherwise, there would be a data race on y". That implies that if y is a normal (non-atomic) bool then the assert may fire, but why?

#include <atomic>
#include <thread>
#include <assert.h>
bool x=false; 
std::atomic<bool> y;
std::atomic<int> z;

void write_x_then_y()
{
 x=true; 
 std::atomic_thread_fence(std::memory_order_release);
 y.store(true,std::memory_order_relaxed); 
}

void read_y_then_x()
{
 while(!y.load(std::memory_order_relaxed)); 
 std::atomic_thread_fence(std::memory_order_acquire);
 if(x) ++z;
}

int main()
{
 x=false;
 y=false;
 z=0;
 std::thread a(write_x_then_y);
 std::thread b(read_y_then_x);
 a.join();
 b.join();
 assert(z.load()!=0); 
}

Now let's change y to a normal bool, and I want to understand why the assert can fire.

#include <atomic>
#include <thread>
#include <assert.h>
bool x=false; 
bool y=false;
std::atomic<int> z;

void write_x_then_y()
{
 x=true; 
 std::atomic_thread_fence(std::memory_order_release);
 y=true;
}

void read_y_then_x()
{
 while(!y); 
 std::atomic_thread_fence(std::memory_order_acquire);
 if(x) ++z;
}

int main()
{
 x=false;
 y=false;
 z=0;
 std::thread a(write_x_then_y);
 std::thread b(read_y_then_x);
 a.join();
 b.join();
 assert(z.load()!=0); 
}

I understand that a data race happens on non-atomic global variables, but in this example if the while loop in read_y_then_x exits, my understanding is that y must either already be set to true, or in the process of being set to true (because it is a non-atomic operation) in the write_x_then_y thread. Since atomic_thread_fence in the write_x_then_y thread makes sure no code written above that can be reordered after, I think the x=true operation must have been finished. In addition, the std::memory_order_release and std::memory_order_acquire tags in two threads make sure that the updated value of x has been synchronized-with the read_y_then_x thread when reading x, so I feel the assert still holds... What am I missing?

CodePudding user response：

Accessing a non-atomic object from two threads without synchronization, where at least one of the accesses is a write, is always a data race and causes undefined behavior. This is how the term "data race" is formally defined in the C++ language, together with its consequences. It is not merely a race condition, which informally refers to multiple possible outcomes being allowed because the ordering of certain thread accesses is unspecified.

The write in y=true; is not ordered with respect to the reads of y in the loop while(!y);, so the two can happen concurrently, which makes it a data race if y is non-atomic. The program would have undefined behavior, which doesn't just mean that the assert might fire. It means that the program may do anything, e.g. crash or freeze up.

The compiler is allowed to optimize under the assumption that a data race never happens. Such optimizations need not preserve your intended behavior, because that behavior relies on the very access that causes the data race.

Furthermore, an infinite loop that never performs any atomic, synchronizing, volatile, or I/O operation also has undefined behavior. So while(!y); has undefined behavior if y is non-atomic and initially false, and the compiler can assume that this line is unreachable under those conditions.

The compiler could for example remove the loop from the function for that reason, as actually does happen with current compilers, see comments under the question.

I am also aware that Clang in particular performs optimizations based on that, and sometimes even goes so far as to drop all contents (including the ret instruction at the end!) from an emitted function containing such an infinite loop, if the function could never be called without undefined behavior. Here, however, y might be true when the function is called, in which case the loop causes no undefined behavior, so this doesn't happen.

All of this is on the language level. It doesn't matter what would happen on the hardware level if the program were compiled in the most literal translation. The hardware would raise additional concerns, e.g. potential tearing of the write access and potential cache incoherency between threads, but both of these are unlikely to be a problem on common platforms for a bool. Another problem, though, might be that the threads could keep a copy of the variable in a register, potentially never producing a store that the other thread could observe, which is allowed for a non-atomic non-volatile object.
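For contrast, here is a minimal sketch of the race-free handshake, with y as a std::atomic<bool> and the two fences doing the ordering. The function name run_fenced_handshake, the lambdas, and the local variables are assumptions made for this illustration; the book's listing uses globals and free functions instead.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper (not from the book): runs one writer/reader
// handshake and returns the final value of z.
int run_fenced_handshake()
{
    bool x = false;               // plain bool; ordered by the fences below
    std::atomic<bool> y{false};   // the flag itself has to be atomic
    std::atomic<int> z{0};

    std::thread writer([&] {
        x = true;
        // Release fence: the write to x is ordered before the store to y.
        std::atomic_thread_fence(std::memory_order_release);
        y.store(true, std::memory_order_relaxed);
    });

    std::thread reader([&] {
        // Atomic load: the compiler must re-read y on every iteration.
        while (!y.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Acquire fence: pairs with the release fence in the writer,
        // so the read of x below sees x == true.
        std::atomic_thread_fence(std::memory_order_acquire);
        if (x) ++z;
    });

    writer.join();
    reader.join();
    return z.load();
}
```

Calling run_fenced_handshake() from main should always return 1, because the fences only establish synchronization through the atomic store to and load from y. With a plain bool y, none of that reasoning applies.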

CodePudding user response：

If you write this:

bool y=false;
...
while(!y);

then the compiler can assume y will not change by itself. The body of the while is empty, so either y is false at the start and you have an endless loop, or y is true at the start and the while ends immediately.

The compiler can optimize this into:

if (!y) while(true);

But C++ also says that there must always be forward progress: an infinite loop without side effects is UB, so the compiler may do whatever it likes when it sees a while(true);, including removing it. gcc and clang will actually do that, as Jerome pointed out here: https://godbolt.org/z/ocrxnee8T

So what std::atomic<bool> y; does is, loosely speaking, the modern form of marking y as volatile. The compiler can no longer assume that repeated reads of y give the same result and can no longer optimize away the while(!y); loop.

Depending on the architecture it will also insert necessary memory barriers so changes to the variable become observable to other threads, which is more than volatile would have done.
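As a rough illustration of that last point, the ordering can also be attached to the atomic operations themselves instead of using separate fences. This is a sketch under my own assumptions, not the listing from the book: the function name run_release_acquire_handshake and the local variable seen are made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns 1 if the reader saw x == true after
// observing the flag, 0 otherwise.
int run_release_acquire_handshake()
{
    bool x = false;               // ordinary data published via the flag
    std::atomic<bool> y{false};   // atomic flag
    int seen = 0;                 // written only by the reader thread

    std::thread writer([&] {
        x = true;
        // Release store: everything written before it becomes visible
        // to a thread that acquires the same flag and reads true.
        y.store(true, std::memory_order_release);
    });

    std::thread reader([&] {
        // Acquire load: re-read on every iteration, never hoisted.
        while (!y.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = x ? 1 : 0;
    });

    writer.join();
    reader.join();   // join makes the reader's write to seen visible here
    return seen;
}
```

Whether the compiler emits an actual barrier instruction for the release store and acquire load depends on the target architecture; the guarantee at the language level is the same either way.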