Synchronizing global reference-counted resource with atomics -- is relaxed appropriate here?


I have a global reference-counted object obj that I want to protect from data races by using atomic operations:

T* obj;                  // initially nullptr
std::atomic<int> count;  // initially zero

My understanding is that I need to use std::memory_order_release after I write to obj, so that other threads will see that it has been created:

void increment()
{
    if (count.load(std::memory_order_relaxed) == 0)
        obj = std::make_unique<T>();

    count.fetch_add(1, std::memory_order_release);
}

Likewise, I need to use std::memory_order_acquire when reading the counter, to ensure the thread has visibility of obj being changed:

void decrement()
{
    count.fetch_sub(1, std::memory_order_relaxed);

    if (count.load(std::memory_order_acquire) == 0)
        obj.reset();
}

I am not convinced that the code above is correct, but I'm not entirely sure why. I feel like after obj.reset() is called, there should be a std::memory_order_release operation to inform other threads about it. Is that correct?

Are there other things that can go wrong, or is my understanding of atomic operations in this case completely wrong?

CodePudding user response:

It is wrong regardless of memory ordering.

As @MaartenBamelis pointed out, with concurrent calls to increment the object can be constructed twice. The same is true for concurrent calls to decrement: the object can be reset twice (which may result in a double destructor call).

Note that the T* obj; declaration disagrees with the code that uses it as a unique_ptr, but neither a raw pointer nor a unique_ptr is safe for concurrent modification. In practice, reset or delete will check the pointer for null, delete the object, and set the pointer to null, and these steps are not atomic.

fetch_add and fetch_sub are fetch-and-op rather than just op for a reason: if you don't use the value observed during the operation, you are most likely racing.
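
For example, a decrement that acts on the value returned by the read-modify-write itself (just a sketch of the pattern; the memory order is picked for illustration):

// The thread that observes the old value 1 is the unique last owner,
// because fetch_sub is a single atomic read-modify-write step.
if (count.fetch_sub(1, std::memory_order_acq_rel) == 1)
    obj.reset();  // only the last owner reaches this line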

CodePudding user response:

This code is inherently racy. If two threads call increment at the same time when count is initially 0, both will see count as 0 and both will create obj (and then race to see whose copy is kept; since unique_ptr has no special threading protections, terrible things can happen if both of them set it at once).

If two threads decrement at the same time (holding the last two references), and finish the fetch_sub before either calls load, both will reset obj (also bad).

And if a decrement finishes its fetch_sub (bringing the count to 0), and another thread then increments before the decrement's load occurs, the increment will see the count as 0 and reinitialize. Whether the object is cleared after being replaced, replaced after being cleared, or some horrible mixture of the two depends on whether increment's fetch_add runs before or after decrement's load.

In short: If you find yourself using two separate atomic operations on the same variable, and testing the result of one of them (without looping, as in a compare and swap loop), you're wrong.

More correct code would look like:

void increment()
{
    if (count.fetch_add(1, std::memory_order_release) == 0)
        obj = std::make_unique<T>();
}

void decrement()
{
    if (count.fetch_sub(1, std::memory_order_acquire) == 1)
        obj.reset();
}

but even then it's not reliable. There's no guarantee that, when count is 0, two threads can't call increment and perform their fetch_adds back to back: exactly one of them is guaranteed to see the count as 0, but that thread might be delayed while the one that saw 1 assumes the object already exists and uses it before it has been initialized.
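
To spell out that interleaving, with count initially 0:

// Thread A: count.fetch_add(1) returns 0  -> A goes off to construct obj
// Thread B: count.fetch_add(1) returns 1  -> B assumes obj already exists
// Thread B: dereferences obj              -> still null: undefined behavior
// Thread A: obj = std::make_unique<T>();  -> too late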

I'm not going to swear there's no mutex-free solution here, but dealing with the issues involved with atomics is almost certainly not worth the headache.

CodePudding user response:

I don't really see a simple way of implementing a reference-counted resource with atomics. Maybe there's some clever way that I haven't thought of yet, but in my experience, clever does not equal readable.

My advice would be to implement it first using a mutex. Then you simply lock the mutex, check the reference count, do whatever needs to be done, and unlock again. It's guaranteed correct:

std::mutex mutex;
int count;
std::unique_ptr<T> obj;

void increment()
{
    auto lock = std::unique_lock{mutex};
    if (++count == 1) // Am I the first reference?
        obj = std::make_unique<T>();
}

void decrement()
{
    auto lock = std::unique_lock{mutex};
    if (--count == 0) // Was I the last reference?
        obj.reset();
}
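
A caller would then bracket its use of the shared object with these calls (do_work() is just a hypothetical stand-in for whatever you do with the resource; error handling and RAII are omitted for brevity):

void worker()
{
    increment();       // object is created on the first reference
    obj->do_work();    // safe: obj cannot be reset while we hold a reference
    decrement();       // object is destroyed when the last reference is released
}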

Although at this point, I would just use a std::shared_ptr instead of managing the reference count myself:

std::mutex mutex;
std::weak_ptr<T> obj;

std::shared_ptr<T> acquire()
{
    auto lock = std::unique_lock{mutex};
    auto sp = obj.lock();
    if (!sp)
        obj = sp = std::make_shared<T>();
    return sp;
}
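
Usage is then just a matter of holding on to the returned shared_ptr for as long as you need the resource (do_work() is again a hypothetical placeholder):

void worker()
{
    auto res = acquire();  // creates the object if nobody else currently holds it
    res->do_work();        // use the resource through the shared_ptr
}                          // res destroyed here; the last owner destroys the T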

I believe this also remains safe when the constructor of the object can throw an exception.

Mutexes are surprisingly performant, so I expect that locking code is plenty quick unless you have a highly specialized use case where you need code to be lock-free.
