I have a global reference-counted object obj that I want to protect from data races by using atomic operations:
T* obj; // initially nullptr
std::atomic<int> count; // initially zero
My understanding is that I need to use std::memory_order_release after I write to obj, so that the other threads will be aware of it being created:
void increment()
{
if (count.load(std::memory_order_relaxed) == 0)
obj = std::make_unique<T>();
count.fetch_add(1, std::memory_order_release);
}
Likewise, I need to use std::memory_order_acquire when reading the counter, to ensure the thread has visibility of obj being changed:
void decrement()
{
count.fetch_sub(1, std::memory_order_relaxed);
if (count.load(std::memory_order_acquire) == 0)
obj.reset();
}
I am not convinced that the code above is correct, but I'm not entirely sure why. I feel like after obj.reset() is called, there should be a std::memory_order_release operation to inform other threads about it. Is that correct?
Are there other things that can go wrong, or is my understanding of atomic operations in this case completely wrong?
CodePudding user response:
It is wrong regardless of memory ordering.
As @MaartenBamelis pointed out, with concurrent calls to increment the object is constructed twice. And the same is true for concurrent decrement: the object is reset twice (which may result in a double destructor call).
Note that there's a disagreement between the T* obj; declaration and its use as a unique_ptr, but neither a raw pointer nor a unique pointer is safe for concurrent modification. In practice, reset or delete will check the pointer for null, then delete it and set it to null, and these steps are not atomic.
fetch_add and fetch_sub are fetch-and-op instead of just op for a reason: if you don't use the value observed during the operation, it is likely to be a race.
CodePudding user response:
This code is inherently racy. If two threads call increment at the same time when count is initially 0, both will see count as 0, and both will create obj (and race to see which copy is kept; given unique_ptr has no special threading protections, terrible things can happen if two of them set it at once).
If two threads decrement at the same time (holding the last two references), and finish the fetch_sub before either calls load, both will reset obj (also bad).
And if a decrement finishes the fetch_sub (to 0), then another thread increments before the decrement's load occurs, the increment will see the count as 0 and reinitialize. Whether the object is cleared after being replaced, or replaced after being cleared, or some horrible mixture of the two, will depend on whether increment's fetch_add runs before or after decrement's load.
In short: If you find yourself using two separate atomic operations on the same variable, and testing the result of one of them (without looping, as in a compare and swap loop), you're wrong.
More correct code would look like:
void increment()
{
if (count.fetch_add(1, std::memory_order_release) == 0)
obj = std::make_unique<T>();
}
void decrement()
{
if (count.fetch_sub(1, std::memory_order_acquire) == 1)
obj.reset();
}
but even then it's not reliable; there's no guarantee that, when count is 0, two threads couldn't call increment, both of them fetch_add at once, and while exactly one of them is guaranteed to see the count as 0, said 0-seeing thread might end up delayed while the one that saw it as 1 assumes the object exists and uses it before it's initialized.
I'm not going to swear there's no mutex-free solution here, but dealing with the issues involved with atomics is almost certainly not worth the headache.
CodePudding user response:
I don't really see a simple way of implementing a reference-counted resource with atomics. Maybe there's some clever way that I haven't thought of yet, but in my experience, clever does not equal readable.
My advice would be to implement it first using a mutex. Then you simply lock the mutex, check the reference count, do whatever needs to be done, and unlock again. It's guaranteed correct:
std::mutex mutex;
int count = 0;
std::unique_ptr<T> obj;
void increment()
{
auto lock = std::unique_lock{mutex};
if (++count == 1) // Am I the first reference?
obj = std::make_unique<T>();
}
void decrement()
{
auto lock = std::unique_lock{mutex};
if (--count == 0) // Was I the last reference?
obj.reset();
}
Although at this point, I would just use a std::shared_ptr instead of managing the reference count myself:
std::mutex mutex;
std::weak_ptr<T> obj;
std::shared_ptr<T> acquire()
{
auto lock = std::unique_lock{mutex};
auto sp = obj.lock();
if (!sp)
obj = sp = std::make_shared<T>();
return sp;
}
I believe this also makes it safe when exceptions may be thrown when constructing the object.
Mutexes are surprisingly performant, so I expect that locking code is plenty quick unless you have a highly specialized use case where you need code to be lock-free.