is a matching __sync_lock_release mandatory?-CodePudding

I am reviewing some circa 2013 C code that uses __sync_xxx built-ins. Mostly these look fine but in one case I found a __sync_lock_test_and_set() without __sync_lock_release(). It looks like the original programmer was using __sync_lock_test_and_set() as an atomic store for pointer values in a hash table. The code doesn't wait on these memory locations (e.g. check the __sync_lock_test_and_set result in a while loop), it just sets them and moves on. Searching on "is __sync_lock_release mandatory" and similar isn't bringing up a clear answer (although that does bring up useful discussions [1][2] making clear the newer GNU atomic built-ins should be used, and I may be able to convince the client to do that).

What problem might be caused by not calling __sync_lock_release() and releasing an acquire memory barrier? Do such barriers "survive" in any way and possibly affect other code?

[1] What is the GCC builtin for an atomic set? [2] Is my spin lock implementation correct and optimal?

CodePudding user response：

Memory barriers (or memory orders for operations) aren't things you have to release in the first place. See https://preshing.com/20120913/acquire-and-release-semantics/ / https://preshing.com/20130922/acquire-and-release-fences

release is one of the possible orderings an operation can have, one that's appropriate for releasing a lock but also for the writing side of acquire/release synchronization.

An acquire operation is one you could use in implementing a function that takes a lock. There isn't a hidden lock anywhere that gets taken behind the scenes. (A non-lock-free implementation of __sync or __atomic builtins might use a lock internally, but it would take and release it within one invocation of the builtin.)

void __sync_lock_release (ptr, 0) is just __atomic_store_explicit(ptr, 0, __ATOMIC_RELEASE). (Actually the docs call it a "barrier", not just an operation with ordering wrt. other operations, so perhaps atomic_thread_fence(memory_order_release) and then a relaxed atomic store.

The documentation wording of "This built-in function releases the lock acquired by __sync_lock_test_and_set." is just an example use-case, there is no resource or lock that "needs to be released".

All legacy __sync builtins are like __atomic_...(__ATOMIC_SEQ_CST), except apparently for __sync_lock_test_and_set and __sync_lock_release; this is the only reason there's a special function with release in the name, because the old sync builtins didn't generally allow release operations.

If the code only wants to do atomic_exchange_explicit(ptr, 1, memory_order_acquire) at that point, that's totally fine and normal, and it can do it with an obsolete __sync_lock_test_and_set if it wants.

If you were going to rewrite anything, use C11 stdatomic.h, or use __atomic builtins if you don't want to make the array _Atomic. Either way, figure out if acquire ordering for that operation is appropriate, and if not use a weaker or stronger order, whatever is required. On x86, any atomic RMW will effectively be seq_cst except for possible compile-time reordering of weaker orders.