I am reviewing some circa 2013 C code that uses __sync_xxx built-ins. Mostly these look fine but in one case I found a __sync_lock_test_and_set() without __sync_lock_release(). It looks like the original programmer was using __sync_lock_test_and_set() as an atomic store for pointer values in a hash table. The code doesn't wait on these memory locations (e.g. check the __sync_lock_test_and_set result in a while loop), it just sets them and moves on. Searching on "is __sync_lock_release mandatory" and similar isn't bringing up a clear answer (although that does bring up useful discussions [1][2] making clear the newer GNU atomic built-ins should be used, and I may be able to convince the client to do that).
What problem might be caused by not calling __sync_lock_release() and releasing an acquire memory barrier? Do such barriers "survive" in any way and possibly affect other code?
[1] What is the GCC builtin for an atomic set? [2] Is my spin lock implementation correct and optimal?
CodePudding user response:
Memory barriers (or memory orders for operations) aren't things you have to release in the first place. See https://preshing.com/20120913/acquire-and-release-semantics/ / https://preshing.com/20130922/acquire-and-release-fences
release
is one of the possible orderings an operation can have, one that's appropriate for releasing a lock but also for the writing side of acquire/release synchronization.
An acquire
operation is one you could use in implementing a function that takes a lock. There isn't a hidden lock anywhere that gets taken behind the scenes. (A non-lock-free implementation of __sync
or __atomic
builtins might use a lock internally, but it would take and release it within one invocation of the builtin.)
void __sync_lock_release (ptr, 0)
is just __atomic_store_explicit(ptr, 0, __ATOMIC_RELEASE)
. (Actually the docs call it a "barrier", not just an operation with ordering wrt. other operations, so perhaps atomic_thread_fence(memory_order_release)
and then a relaxed atomic store.
The documentation wording of "This built-in function releases the lock acquired by __sync_lock_test_and_set
." is just an example use-case, there is no resource or lock that "needs to be released".
All legacy __sync
builtins are like __atomic_...(__ATOMIC_SEQ_CST)
, except apparently for __sync_lock_test_and_set
and __sync_lock_release
; this is the only reason there's a special function with release
in the name, because the old sync
builtins didn't generally allow release operations.
If the code only wants to do atomic_exchange_explicit(ptr, 1, memory_order_acquire)
at that point, that's totally fine and normal, and it can do it with an obsolete __sync_lock_test_and_set
if it wants.
If you were going to rewrite anything, use C11 stdatomic.h
, or use __atomic
builtins if you don't want to make the array _Atomic
. Either way, figure out if acquire ordering for that operation is appropriate, and if not use a weaker or stronger order, whatever is required. On x86, any atomic RMW will effectively be seq_cst
except for possible compile-time reordering of weaker orders.