Memory barrier usage with CAS operations

In the code snippet from cppreference, the memory orderings std::memory_order_release and std::memory_order_relaxed are used for the success and failure cases, respectively. When is it OK to use std::memory_order_release for both, or std::memory_order_relaxed for both?

#include <atomic>
 
template<class T>
struct node
{
    T data;
    node* next;
    node(const T& data) : data(data), next(nullptr) {}
};
 
template<class T>
class stack
{
    std::atomic<node<T>*> head{nullptr}; // start empty; before C++20 a default-constructed atomic is uninitialized
 public:
    void push(const T& data)
    {
        node<T>* new_node = new node<T>(data);
 
        // put the current value of head into new_node->next
        new_node->next = head.load(std::memory_order_relaxed);
 
        // now make new_node the new head, but if the head
        // is no longer what's stored in new_node->next
        // (some other thread must have inserted a node just now)
        // then put that new head into new_node->next and try again
        while(!std::atomic_compare_exchange_weak_explicit(
                                &head,
                                &new_node->next,
                                new_node,
                                std::memory_order_release,
                                std::memory_order_relaxed))
                ; // the body of the loop is empty
        // note: the above loop is not thread-safe in at least
        // GCC prior to 4.8.3 (bug 60272), clang prior to 2014-05-05 (bug 18899)
        // MSVC prior to 2014-03-17 (bug 819819). See member function version for workaround
    }
};

CodePudding user response：

Using relaxed for both would not be safe. If the compare_exchange succeeds, then head is updated with the value of new_node, and other threads reading head will get that pointer. However, without release ordering, the value written to new_node->next (now head->next) may not be globally visible yet, so if the other thread tries to read head->next it may see garbage, or misbehave in other ways.

Formally, the write to new_node->next needs to happen before any other thread tries to read it. In this code that is ensured by putting release ordering on the store that signals to other threads that the value is ready. (Likewise, the thread that reads head needs to use acquire ordering.) With relaxed ordering on the success store, the happens-before relationship is not there, so the code has a data race and its behavior is undefined.
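To make the release/acquire pairing concrete, here is a minimal sketch of one thread publishing a node and another reading it. The names Node and publish_and_read are hypothetical and not part of the cppreference sample, and the setup assumes exactly one producer and one consumer.

```cpp
#include <atomic>
#include <thread>

struct Node {
    int data;
    Node* next;
};

// Hypothetical helper: one thread publishes a node with a release CAS,
// another waits for it with an acquire load and then reads its fields.
int publish_and_read(int value)
{
    std::atomic<Node*> head{nullptr};
    Node sentinel{0, nullptr};
    int seen = -1;

    std::thread producer([&] {
        // Plain (non-atomic) writes to the node's fields.
        Node* n = new Node{value, &sentinel};
        Node* expected = nullptr;
        // Release on success orders the writes above before the publication.
        // Nothing was written on failure, so relaxed is enough there.
        while (!head.compare_exchange_weak(expected, n,
                                           std::memory_order_release,
                                           std::memory_order_relaxed))
            ; // only spurious failures can occur here; expected stays nullptr
    });

    std::thread consumer([&] {
        Node* p;
        // The acquire load that observes the pointer synchronizes with the
        // release CAS, so reading p->data and p->next is race-free.
        while ((p = head.load(std::memory_order_acquire)) == nullptr)
            std::this_thread::yield();
        if (p->next == &sentinel)
            seen = p->data;
    });

    producer.join();
    consumer.join();
    delete head.load(std::memory_order_relaxed);
    return seen;
}
```

If the success ordering were relaxed, or the consumer's load were relaxed, the reads of p->data and p->next would race with the producer's writes, even though the program might appear to work on strongly ordered hardware.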

Using release for both would not make sense, because release ordering only makes sense for stores, and in the failure case, no store is performed. In fact, for this reason, passing std::memory_order_release for the failure ordering is actually illegal (the failure ordering may not be std::memory_order_release or std::memory_order_acq_rel); this is stated on the page where you got the sample code from. Using acquire or seq_cst would be safe (stronger ordering is always safe) but unnecessary, and might cause a needless performance hit.
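As a rough illustration of the legal combinations, the sketch below writes push with the member function compare_exchange_weak in two ways: once with the explicit release/relaxed pair, and once with the single-order overload. With that overload the standard derives the failure ordering from the one you pass, replacing release with relaxed, so "release for both" spelled this way is well-defined and means the same as the explicit pair. The names push_single_order, size and top, as well as the destructor, are hypothetical additions for the demonstration.

```cpp
#include <atomic>
#include <cstddef>

template<class T>
struct node
{
    T data;
    node* next;
    node(const T& data) : data(data), next(nullptr) {}
};

template<class T>
class stack
{
    std::atomic<node<T>*> head{nullptr};
 public:
    // Explicit pair: release on success, relaxed on failure.
    void push(const T& data)
    {
        node<T>* new_node = new node<T>(data);
        new_node->next = head.load(std::memory_order_relaxed);
        while (!head.compare_exchange_weak(new_node->next, new_node,
                                           std::memory_order_release,
                                           std::memory_order_relaxed))
            ; // on failure, new_node->next has been refreshed; retry
    }

    // Single-order overload: success is release, and the failure ordering
    // is derived from it as relaxed.
    void push_single_order(const T& data)
    {
        node<T>* new_node = new node<T>(data);
        new_node->next = head.load(std::memory_order_relaxed);
        while (!head.compare_exchange_weak(new_node->next, new_node,
                                           std::memory_order_release))
            ;
    }

    // Hypothetical reader: the acquire load pairs with the release CAS above.
    std::size_t size() const
    {
        std::size_t n = 0;
        for (node<T>* p = head.load(std::memory_order_acquire); p != nullptr; p = p->next)
            ++n;
        return n;
    }

    // Precondition: the stack is not empty.
    const T& top() const { return head.load(std::memory_order_acquire)->data; }

    // Not thread-safe; assumes no other thread is still using the stack.
    ~stack()
    {
        node<T>* p = head.load(std::memory_order_relaxed);
        while (p != nullptr) {
            node<T>* next = p->next;
            delete p;
            p = next;
        }
    }
};
```

For example, pushing 1, 2 and 3 through either function leaves size() at 3 and top() at 3. Passing std::memory_order_release or std::memory_order_acq_rel as the explicit failure argument is still not allowed.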