I have a question about the definition of the synchronises-with relation in the C++ memory model when relaxed and acquire/release accesses are mixed on one and the same atomic variable. Consider the following example consisting of a global initialiser and three threads:
int x = 0;
std::atomic<int> atm(0);
[thread T1]
x = 42;
atm.store(1, std::memory_order_release);
[thread T2]
if (atm.load(std::memory_order_relaxed) == 1)
atm.store(2, std::memory_order_relaxed);
[thread T3]
int value = atm.load(std::memory_order_acquire);
assert(value != 1 || x == 42); // Hopefully this is guaranteed to hold.
assert(value != 2 || x == 42); // Does this assert hold necessarily??
My question is whether the second assert in T3 can fail under the C++ memory model. Note that the answer to this SO question suggests that the assert could not fail if T2 used load/acquire and store/release; please correct me if I got this wrong. However, as stated above, the answer seems to depend on how exactly the synchronises-with relation is defined in this case. I was confused by the text on cppreference, and I came up with the following two possible readings.
1. The second assert fails. The store to atm in T1 could be conceptually understood as storing 1_release, where _release is an annotation specifying how the value was stored; along the same lines, the store in T2 could be understood as storing 2_relaxed. Hence, if the load in T3 returns 2, the thread actually read 2_relaxed; thus, the load in T3 does not synchronise-with the store in T1 and there is no guarantee that T3 sees x == 42. However, if the load in T3 returns 1, then 1_release was read, and therefore the load in T3 synchronises-with the store in T1 and T3 is guaranteed to see x == 42.
2. The second assert succeeds. If the load in T3 returns 2, then this load reads a side-effect of the relaxed store in T2; however, this store of T2 is present in the modification order of atm only if the modification order of atm contains a preceding store with release semantics. Therefore, the load/acquire in T3 synchronises-with the store/release of T1 because the latter necessarily precedes the former in the modification order of atm.
At first glance, the answer to this SO question seems to suggest that my reading 1 is correct. However, that answer seems to be different in a subtle way: all stores in the answer are release, and the crux of the question is to see that load/acquire and store/release establishes synchronises-with between a pair of threads. In contrast, my question is about how exactly synchronises-with is defined when memory orders are heterogeneous.
I actually hope that reading 2 is correct since this would make reasoning about concurrency easier. Thread T2 does not read or write any memory other than atm; therefore, T2 itself has no synchronisation requirements and should therefore be able to use relaxed memory order. In contrast, T1 publishes x and T3 consumes it -- that is, these two threads communicate with each other so they should clearly use acquire/release semantics. In other words, if interpretation 1 turns out to be correct, then the code of T2 cannot be written by thinking only about what T2 does; rather, the code of T2 needs to know that it should not "disturb" synchronisation between T1 and T3.
In any case, knowing what exactly is sanctioned by the standard in this case seems absolutely crucial to me.
Answer:
Because T2 uses relaxed ordering on a separate load and store, the release sequence headed by T1's release store is broken, and the second assert can trigger (although not on a TSO platform such as x86).
You can fix this by either using acq/rel ordering in thread T2 (as you suggested) or by modifying T2 to use an atomic read-modify-write operation (RMW), like this:
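[Thread T2]
bool stored;
do {
    int val = 1;  // expected value; overwritten with the observed value on failure
    // compare_exchange_weak returns true only if it replaced 1 with 2.
    stored = atm.compare_exchange_weak(val, 2, std::memory_order_relaxed);
} while (!stored);  // retry until T1's store of 1 has been replaced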
The modification order of atm is 0-1-2; whether T3 picks up 1 or 2, no assert can fail.
Another valid implementation of T2 is:
[thread T2]
if (atm.load(std::memory_order_relaxed) == 1)
{
atm.exchange(2, std::memory_order_relaxed);
}
Here the RMW itself is unconditional and it must be accompanied by an if-statement & (relaxed) load to ensure that the modification order of atm is 0-1 or 0-1-2.
Without the if-statement, the modification order could be 0-2 which can cause the assert to fail. (This works because we know there is only one other write in the whole rest of the program. Separate if() / exchange is of course not in general equivalent to compare_exchange_strong.)
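The two kinds of fix (acquire/release in T2, or a relaxed RMW in T2) can be put side by side in a small harness. This is only a sketch: Shared, t2_acq_rel, t2_relaxed_cas and run_once are hypothetical names, and the RMW variant assumes a single compare_exchange_strong attempt instead of a retry loop, so that T2 stays non-blocking like the T2 in the question. T3 reads x only after observing a non-zero value, mirroring the short-circuit in the original asserts.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical harness; every name here is illustrative.
struct Shared {
    int x = 0;
    std::atomic<int> atm{0};
};

// T2, fix 1: acquire load plus release store. T1 synchronises-with T2,
// and T2 synchronises-with T3, so x = 42 happens-before T3's read of x.
void t2_acq_rel(Shared& s) {
    if (s.atm.load(std::memory_order_acquire) == 1)
        s.atm.store(2, std::memory_order_release);
}

// T2, fix 2: a relaxed RMW. The write of 2 only happens on top of 1,
// so it extends the release sequence headed by T1's store.
void t2_relaxed_cas(Shared& s) {
    int expected = 1;
    s.atm.compare_exchange_strong(expected, 2, std::memory_order_relaxed);
}

// Runs T1, T2 and T3 once; returns true if T3's checks held.
bool run_once(void (*t2_body)(Shared&)) {
    Shared s;
    bool ok = true;
    std::thread t1([&] {
        s.x = 42;
        s.atm.store(1, std::memory_order_release);
    });
    std::thread t2([&] { t2_body(s); });
    std::thread t3([&] {
        int value = s.atm.load(std::memory_order_acquire);
        // x is read only when value != 0, so there is no data race on x.
        ok = (value == 0) || (s.x == 42);
    });
    t1.join();
    t2.join();
    t3.join();
    return ok;  // safe to read: t3 has been joined
}
```

Calling run_once(t2_acq_rel) or run_once(t2_relaxed_cas) in a loop should always return true under the release-sequence rules. A passing stress run proves nothing by itself, though, particularly on x86, where even the relaxed load/store version of T2 is unlikely to fail in practice.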
In the C++ standard, the following quotes are related:
[intro.races]
A release sequence headed by a release operation A on an atomic object M is a maximal contiguous subsequence of side effects in the modification order of M, where the first operation is A, and every subsequent operation is an atomic read-modify-write operation.
[atomics.order]
An atomic operation A that performs a release operation on an atomic object M synchronizes with an atomic operation B that performs an acquire operation on M and takes its value from any side effect in the release sequence headed by A.
This question is about why an RMW works in a release sequence.
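To make the release-sequence wording concrete, here is a toy model of a single object's modification order. It is a sketch under simplifying assumptions, not anything defined by the standard: SideEffect and nearest_release_head are hypothetical names, each side effect is reduced to two flags, and only direct synchronises-with through a release sequence is modelled (not happens-before chains through other threads).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one side effect in the modification order of M.
struct SideEffect {
    int value;
    bool is_release;  // performed with release (or stronger) ordering
    bool is_rmw;      // atomic read-modify-write operation
};

// Returns the index of the nearest release operation whose release sequence
// contains mo[read_from], or -1 if no release sequence contains it.
// Precondition: read_from < mo.size().
int nearest_release_head(const std::vector<SideEffect>& mo, std::size_t read_from) {
    for (std::size_t i = read_from + 1; i-- > 0;) {
        if (mo[i].is_release) return static_cast<int>(i);  // found a head
        if (!mo[i].is_rmw) return -1;  // a plain non-release store breaks the chain
    }
    return -1;
}
```

With the question's T2, the modification order is {0, plain}, {1, release}, {2, plain relaxed store}, and nearest_release_head returns -1 for the value 2: an acquire load that reads 2 does not synchronise with T1. If the write of 2 is a relaxed RMW instead, the function returns the index of T1's store, which matches the fixes described above.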