can __sync_local_test_and_set/xchg be used to implement a mutex lock?-CodePudding

most search results for __sync_local_test_and_set are mentioning it can be used to implement a spinlock instead of a mutex lock. The article Multi-threaded programming: efficiency of locking also reads: I could not think of a means to implement mutex without closely interacting with the kernel(futex system call) and also gives a example of implementing spinlock with __sync_lock_test_and_set.

But I think __sync_lock_test_and_set can also be used to implement a mutex lock:

if (__sync_lock_test_and_test(&lock, 1) == 0) { 
    // we get the lock, do our work
    lock = 0; // or __sync_lock_release(&lock);
} else {
    // the lock has already been token.
}

Does this work as a mutex lock?

Thanks in advance!

CodePudding user response：

First, the __sync_* builtins are specific to the Linux kernel. In other contexts you would want to use the C11 atomic types and functions from <stdatomic.h> (or C 11 std::atomic from <atomic>).

Your algorithm works, as far as it goes. But the question is what goes in // the lock has already been taken. What do you do in that block? In most cases, the "work" you have to do isn't optional. You can't just skip it; you have to wait and do it as soon as the lock is available. In essence, what you've written is only the trylock and unlock functions of a typical mutex. The harder one is lock.

You want to:

Wait for the mutex without consuming CPU
Wake up and take the mutex as soon as it becomes available, without unnecessary delay
If multiple threads are waiting for the mutex, arbitrate fairly between them, so that no one thread gets starved.

You can sort of accomplish #1 by doing something like

while (atomic_flag_test_and_set(&lock, 1) == 1) {
    thrd_yield(); 
    // or: thrd_sleep(short_time);
}

But you will still have the waiting thread waking up frequently, only to test the mutex, find it's still unavailable, and go back to sleep. This may not consume much CPU time all told, but it does force a lot of unnecessary context switches, which is not free.

This doesn't do so great on #2, because when the mutex becomes available, this thread will not wake up to take it immediately; it will have to wait until it happens to get another time slice (in the case of thrd_yield()) or until its short_time expires. This unnecessarily slows down your program.

And it doesn't accomplish #3 at all. If several threads are waiting on the same mutex, it is basically random which one ends up taking it, and some may end up unfairly waiting much longer than others.

So you really need support from the OS to do lock properly. The OS can create a queue of threads waiting on this particular mutex, and leave them asleep as long as it is unavailable. When another thread unlocks the mutex, the OS can choose the oldest thread in the queue (or apply some other clever priority-based algorithm), wake up that thread and only that thread, and ensure that it does hold the mutex when it awakens.

Without OS support, you can't really do anything except a spinlock or some slightly glorified variant of it.