In the code snippet from cppreference, the memory orders std::memory_order_release and std::memory_order_relaxed are used for the success and failure cases respectively. When is it OK to use std::memory_order_release for both, or std::memory_order_relaxed for both?
#include <atomic>

template<class T>
struct node
{
    T data;
    node* next;
    node(const T& data) : data(data), next(nullptr) {}
};

template<class T>
class stack
{
    std::atomic<node<T>*> head;
public:
    void push(const T& data)
    {
        node<T>* new_node = new node<T>(data);

        // put the current value of head into new_node->next
        new_node->next = head.load(std::memory_order_relaxed);

        // now make new_node the new head, but if the head
        // is no longer what's stored in new_node->next
        // (some other thread must have inserted a node just now)
        // then put that new head into new_node->next and try again
        while(!std::atomic_compare_exchange_weak_explicit(
                  &head,
                  &new_node->next,
                  new_node,
                  std::memory_order_release,
                  std::memory_order_relaxed))
            ; // the body of the loop is empty
        // note: the above loop is not thread-safe in at least
        // GCC prior to 4.8.3 (bug 60272), clang prior to 2014-05-05 (bug 18899)
        // MSVC prior to 2014-03-17 (bug 819819). See member function version for workaround
    }
};
CodePudding user response:
Using relaxed for both would not be safe. If the compare_exchange succeeds, then head is updated with the value of new_node, and other threads reading head will get that pointer. However, without release ordering, the value written to new_node->next (now head->next) may not be globally visible yet, so if another thread tries to read head->next it may see garbage or misbehave in other ways.
Formally, the write to new_node->next needs to happen before any other thread tries to read it, which can only be ensured by having release ordering on the store that signals other threads that the value is ready. (Likewise, the thread that reads head needs to use acquire ordering.) With relaxed ordering on the success store, the happens-before relationship is not there, so the code has a data race and its behavior is undefined.
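To make the release/acquire pairing concrete, here is a minimal sketch of both sides. The Node type, the global head, and the sum_list reader are hypothetical names made up for this illustration; they are not part of the cppreference sample. Nodes are never freed, which keeps the sketch short.

```cpp
#include <atomic>

// Hypothetical minimal node type for this sketch.
struct Node {
    int data;
    Node* next;
};

std::atomic<Node*> head{nullptr};

// Writer: fill in the node first, then publish it with a release CAS.
void push(int value) {
    Node* n = new Node{value, head.load(std::memory_order_relaxed)};
    while (!head.compare_exchange_weak(n->next, n,
                                       std::memory_order_release,   // success: publishes n->data and n->next
                                       std::memory_order_relaxed)) { // failure: nothing is published
        // on failure, n->next has been updated to the current head; retry
    }
}

// Reader: the acquire load pairs with the release CAS, so the fields of
// the node it returns are visible. Earlier nodes are covered as well,
// because each successful CAS is a read-modify-write and therefore
// continues the release sequence of the pushes before it.
int sum_list() {
    int total = 0;
    for (Node* p = head.load(std::memory_order_acquire); p != nullptr; p = p->next)
        total += p->data;
    return total;
}
```

Any number of threads may call push concurrently while another thread calls sum_list. If the success ordering were relaxed, the read of p->data or p->next in sum_list would race with the writer's initialization of the node.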
Using release for both would not make sense, because release ordering only applies to stores, and in the failure case no store is performed. For this reason, passing std::memory_order_release for the failure ordering is actually illegal; this is stated on the cppreference page the sample code comes from. Using acquire or seq_cst for the failure ordering would be safe (stronger ordering is always safe) but unnecessary, and might cause a needless performance hit.
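For contrast, here is a sketch of a case where the failure ordering does matter. In push, a failed compare_exchange only copies the new head pointer into new_node->next and never dereferences it, so relaxed is enough. In the hypothetical pop below (not part of the cppreference sample), the pointer obtained from a failed attempt is dereferenced on the next iteration, so the failure ordering is acquire. Nodes are deliberately never freed, which sidesteps the ABA and memory-reclamation problems a real pop has to solve.

```cpp
#include <atomic>

// Hypothetical minimal node type for this sketch.
struct Node {
    int data;
    Node* next;
};

std::atomic<Node*> head{nullptr};

void push(int value) {
    Node* n = new Node{value, head.load(std::memory_order_relaxed)};
    // Failure only rewrites n->next with a pointer that is not
    // dereferenced here, so relaxed is sufficient for the failure case.
    while (!head.compare_exchange_weak(n->next, n,
                                       std::memory_order_release,
                                       std::memory_order_relaxed)) {
    }
}

// Returns false if the stack is empty, otherwise stores the top value in out.
bool pop(int& out) {
    Node* old_head = head.load(std::memory_order_acquire);
    // On failure, old_head is replaced by the current head and then
    // dereferenced (old_head->next) on the next attempt, so the failure
    // load must be acquire to see that node's fields.
    while (old_head != nullptr &&
           !head.compare_exchange_weak(old_head, old_head->next,
                                       std::memory_order_acquire,
                                       std::memory_order_acquire)) {
    }
    if (old_head == nullptr)
        return false;
    out = old_head->data;
    return true;  // the node is intentionally leaked in this sketch
}
```

The rule of thumb this illustrates: pick the failure ordering based on what the code does with the value it reads back when the exchange fails.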