Is there a way that we can write some code to produce a "spurious failures" for the "weak" version of compare_exchange? While the same code should work well for compare_exchange_strong?
I wish to see the real difference between the "weak" and "strong" version of 2 apis, any sample for it?
Thanks a lot!
CodePudding user response:
You'll need a non-x86 CPU, typically one that uses load-linked / store-conditional like ARM or AArch64 (before ARMv8.1). x86 lock cmpxchg
implements CAS_strong, and there's nothing you can do to weaken it and make it possibly fail when the value in memory does match.
CAS_weak can compile to a single LL/SC attempt; CAS_strong has to compile to an LL/SC retry loop to only fail if the compare was false, not merely from competition for the cache line from other threads. (As on Godbolt, for ARMv8 ldxr
/stxr
vs. ARMv8.1 cas
)
Have one thread use CAS_weak to do 100 million increments of a std::atomic variable. (https://preshing.com/20150402/you-can-do-any-kind-of-atomic-read-modify-write-operation/ explains how to use a CAS retry loop to implement any arbitrary atomic RMW operation on one variable, but since this is the only thread writing, CAS_strong will always succeed and we don't need a retry loop. My Godbolt example shows doing a single increment attempt using CAS_weak vs. CAS_strong.)
To cause spurious failures for CAS_weak but not CAS_strong, have another thread reading the same variable, or writing something else in the same cache line.
Or maybe doing something like var.fetch_or(0, std::memory_order_relaxed)
to truly get ownership of the cache line, but in a way that wouldn't affect the value. That will definitely cause spurious LL/SC failures.
When you're done, 100M CAS_strong attempts will have incremented the variable from 0 to 100000000. But 100M CAS_weak attempts will have incremented to some lower value, with the difference being the number of failures.