Please consider the following synchronization problem:
initially:
version = 0 // atomic variable
data = 0 // normal variable (there could be many)
Thread A:
version
data = 3
Thread B:
d = data
v = version
assert(d != 3 || v == 1)
Basically, if thread B sees data = 3
then it must also see version
.
What's the weakest memory order and synchronization we must impose so that the assertion in thread B is always satisfied?
If I understand C memory_order correctly, the release-acquire ordering won't do because that guarantees that operations BEFORE version
, in thread A, will be seen by the operations AFTER v = version
, in thread B.
Acquire and release fences also work in the same directions, but are more general.
As I said, I need the other direction: B sees data = 3
implies B sees version = 1
.
I'm using this "versioning approach" to avoid locks as much as possible in a data structure I'm designing. When I see something has changed, I step back, read the new version
and try again.
I'm trying to keep my code as portable as possible, but I'm targeting x86-64 CPUs.
CodePudding user response:
You might be looking for a SeqLock. You can use the sequence counter as the version number. (version = tmp_counter >> 1
since you need two increments per write of the payload to let readers detect tearing when reading the non-atomic data
. And to make sure they see the data
that goes with this sequence number. Make sure you don't read the atomic counter a 3rd time; use the local tmp that you read it into to verify match before/after copying data
.)
Readers will have to retry if they happen to attempt a read while data
is being modified. But it's non-atomic, so there's no way if thread B sees data = 3
can ever be part of what creates synchronization; it can only be something you see as a result of synchronizing with a version number from the writer.
See:
Implementing 64 bit atomic counter with 32 bit atomics - my attempt at a SeqLock in C , with lots of comments. It's a bit of a hack because ISO C 's data-race UB rules are overly strict; a SeqLock relies on detecting possible tearing and not using torn data, rather than avoiding concurrent access entirely. That's fine on a machine without hardware race detection so that doesn't fault (like all real CPUs), but C still calls that UB, even with
volatile
(although that puts it more into implementation-defined territory). In practice it's fine.GCC reordering up across load with `memory_order_seq_cst`. Is this allowed? - A GCC bug fixed in 8.1 that could break a seqlock implementation.