AMD's mwaitx instruction allows you to wait for an address to change, but the wait has a limited duration. There's no way to tell whether it woke up because the value changed or because of an interrupt.
You can always inspect the address to see if it's changed, but this leads to the ABA problem where it could've changed, then changed back.
This is a problem if you want to send an update request for a data block whenever its associated lock was accessed: if the lock is acquired, used, then released, the data structure has changed but the value of the lock doesn't APPEAR to have changed, so the thread using mwaitx isn't aware that the lock was accessed.
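To make the failure concrete, here is a minimal single-threaded C sketch of that sequence. The names lock_word, data_block, writer_updates_data and lock_word_looks_changed are made up for illustration, and the "other thread" is simply called inline:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_uint lock_word;   /* 0 = free, 1 = held */
static int data_block;          /* protected by lock_word */

/* Another thread's critical section, run inline here for illustration. */
static void writer_updates_data(void)
{
    unsigned was = atomic_exchange(&lock_word, 1);  /* acquire: 0 -> 1 */
    assert(was == 0);                               /* lock must have been free */
    data_block++;                                   /* use */
    atomic_store(&lock_word, 0);                    /* release: 1 -> 0 */
}

/* All the waiter can do after waking: compare against an earlier snapshot. */
static bool lock_word_looks_changed(unsigned snapshot)
{
    return atomic_load(&lock_word) != snapshot;
}
```

Taking a snapshot of lock_word, calling writer_updates_data(), and then calling lock_word_looks_changed(snapshot) yields false even though data_block was modified in between.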
Is there any workaround for this or am I stuck?
Answer:
The idea is to use mwaitx as an optimisation, not as your sole synchronisation primitive. When it is used that way, it doesn't matter if it sporadically fails.
Say, for example, you want to acquire a spin lock, declared as mutex dd 0, by setting it to 1 if it held 0 before. The simple way to do this is to busy-wait for the lock to become zero, e.g. like this:
again: mov ebx, 1
xchg [mutex], ebx ; try to claim mutex
test ebx, ebx ; did we succeed?
jnz again ; if not, try again
This loop is of course quite inefficient: if the lock is held by another thread, it spins very quickly, causing a lot of expensive RMW bus accesses. We can reduce the load using a pause instruction, but it would be even better if, once we know the lock is held, we only tried to claim it again after the other thread has released it.
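For reference, the pause variant could be sketched in C roughly as follows. This is only an illustration under a few assumptions: it uses C11 atomics rather than hand-written assembly, the names cpu_relax, spin_trylock, spin_lock and spin_unlock are invented here, and it additionally polls with plain loads between exchange attempts (test-and-test-and-set), which goes slightly beyond what is described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hint to the CPU that we are in a spin-wait loop (no-op off x86). */
static inline void cpu_relax(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_ia32_pause();   /* emits the pause instruction */
#endif
}

/* One claim attempt: set the lock to 1, succeed if it held 0 before. */
static bool spin_trylock(atomic_uint *lock)
{
    return atomic_exchange_explicit(lock, 1, memory_order_acquire) == 0;
}

static void spin_lock(atomic_uint *lock)
{
    while (!spin_trylock(lock)) {
        /* Lock is held: poll with plain loads and pause instead of
           hammering the cache line with locked exchanges. */
        while (atomic_load_explicit(lock, memory_order_relaxed) != 0)
            cpu_relax();
    }
}

static void spin_unlock(atomic_uint *lock)
{
    assert(atomic_load_explicit(lock, memory_order_relaxed) == 1);
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

This still burns CPU time while the lock is held; it only makes the spinning cheaper for the rest of the system.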
The monitor / mwait instructions provide a facility to do this: you set up an address to be monitored and then get notified when something interesting happens there. Only then do you try to claim the lock, saving you from spinning when you know that you won't get it. It doesn't hurt if mwait returns early: you'll just fail to claim the lock and then go right back to waiting for it.
again: mov ebx, 1
xchg [mutex], ebx ; try to claim mutex
test ebx, ebx ; did we succeed?
jz gotit
lea rax, [mutex] ; address to monitor
xor ecx, ecx ; no extensions
xor edx, edx ; no hints
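monitor ; start to monitor the mutex
cmp dword [mutex], 0 ; released before the monitor was armed?
je again ; if so, don't wait: try to claim it right away
xor ecx, ecx ; no extensions
xor eax, eax ; no hints
mwait ; wait for a write to the mutex
jmp again ; once it changed, try to get the lock again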
gotit: ...
While this is good and all, there's a new problem: if the other thread doesn't release the lock for a while, sophisticated mutex implementations might want to switch to a different strategy, e.g. one where the kernel takes care of the lock. This permits waiting for the lock to become available without the thread having to actually run, freeing up resources for other users.
This is hard to achieve with monitor and mwait: the waiting period is indefinite and could be very long. AMD's mwaitx and monitorx instructions are very similar but fix this problem: they permit you to set a timeout after which mwaitx returns even if the memory region did not change. This way, you can use an algorithm like "try to claim the lock 10 times by spinning and waiting, then escalate to a kernel-based lock" and be reasonably sure of how long it takes to execute.
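A rough C sketch of that bounded algorithm might look like the following. Treat it as an illustration under stated assumptions rather than a finished lock: lock_bounded, unlock, wait_hint_fn and wait_mwaitx are names invented here, the wait step is passed in as a callback so the logic can also run on CPUs without mwaitx, and the register usage (ecx bit 1 enables the timer, ebx holds the timeout in TSC-rate cycles) reflects my reading of AMD's manual and should be checked against it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook: wait until *addr may have changed or the timeout
   expires. Early (spurious) returns are allowed. */
typedef void (*wait_hint_fn)(atomic_uint *addr, uint32_t timeout_cycles);

#if defined(__x86_64__) && defined(__GNUC__)
/* Raw encodings of monitorx (0F 01 FA) and mwaitx (0F 01 FB). Only call
   this on CPUs that report MONITORX support; elsewhere it faults. */
static inline void wait_mwaitx(atomic_uint *addr, uint32_t timeout_cycles)
{
    /* monitorx: rax = address, ecx = extensions (0), edx = hints (0) */
    __asm__ volatile(".byte 0x0f, 0x01, 0xfa" :: "a"(addr), "c"(0), "d"(0));
    /* Re-check after arming so a release that raced with us is not missed. */
    if (atomic_load_explicit(addr, memory_order_relaxed) == 0)
        return;
    /* mwaitx: eax = hints (0), ecx bit 1 = enable timer, ebx = timeout */
    __asm__ volatile(".byte 0x0f, 0x01, 0xfb"
                     :: "a"(0), "b"(timeout_cycles), "c"(2) : "memory");
}
#endif

/* Try to take the lock at most max_tries times, waiting between attempts.
   Returns true if acquired; false tells the caller to escalate, e.g. to a
   kernel-based lock (not shown). */
static bool lock_bounded(atomic_uint *lock, int max_tries,
                         wait_hint_fn wait, uint32_t timeout_cycles)
{
    for (int i = 0; i < max_tries; i++) {
        if (atomic_exchange_explicit(lock, 1, memory_order_acquire) == 0)
            return true;                /* it held 0 before: the lock is ours */
        if (wait != NULL)
            wait(lock, timeout_cycles); /* may return early; we just retry */
    }
    return false;
}

static void unlock(atomic_uint *lock)
{
    assert(atomic_load_explicit(lock, memory_order_relaxed) == 1);
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

On a CPU with MONITORX support you would pass wait_mwaitx as the callback; passing NULL degrades to plain bounded spinning. Because each wait is capped by the timeout, the worst-case time before escalation is roughly max_tries times the timeout, ignoring wake-ups caused by interrupts.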