I am a programmer learning assembly language to get an intuitive understanding of how my code runs on the CPU.
While I was studying the ASM keyword LOCK, Google told me that the CPU takes exclusive ownership of the data bus while executing the instruction that follows the LOCK prefix.
But nothing I found explains how the CPU actually takes that exclusive ownership.
I also found that the 8086 microchip has a lock pin which does exactly the same thing as the LOCK keyword. This may be the logic circuit that implements the LOCK keyword.
Can anyone explain the mechanism of the lock pin? While the lock pin is active, how are other CPUs rejected when they try to acquire the data bus?
CodePudding user response:
If the only CPU has the memory bus locked, no other device can read or change memory contents during that time, not even via DMA. (Or with multiple CPUs on a shared bus with no cache, same deal.) Therefore, no other memory operations at all can happen between the load and the store of a lock add [di], ax for example, making it atomic wrt. any possible observer. (Other than a logic analyzer connected to the bus, which doesn't count.)
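To make the "nothing can happen between the load and the store" point concrete, here is a minimal sketch in Rust rather than assembly. The function names atomic_total and split_total are made up for this illustration. On x86-64, compilers typically turn fetch_add into a lock-prefixed add or xadd, while the split version is an ordinary load followed by an ordinary store, so another thread can slip in between them.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: `threads` workers each add 1 to a shared counter
/// `iters` times using one atomic read-modify-write per increment.
fn atomic_total(threads: u32, iters: u32) -> u32 {
    let counter = Arc::new(AtomicU32::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..iters {
                    // Load, add and store happen as one indivisible step.
                    c.fetch_add(1, Ordering::SeqCst);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    counter.load(Ordering::SeqCst)
}

/// Hypothetical helper: same work, but the load and the store are separate
/// operations, so increments from other threads can be overwritten.
fn split_total(threads: u32, iters: u32) -> u32 {
    let counter = Arc::new(AtomicU32::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..iters {
                    let v = c.load(Ordering::Relaxed);
                    // Another thread may store here, and its update is lost.
                    c.store(v.wrapping_add(1), Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("atomic RMW: {}", atomic_total(4, 100_000));
    println!("split load/store: {}", split_total(4, 100_000));
}
```

The atomic version always ends at threads * iters. The split version can end anywhere up to that value; how many updates it loses depends on timing and on the machine.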
Semi-related: Can num++ be atomic for 'int num'? describes how the lock prefix works on modern CPUs for cacheable memory, providing RMW atomicity without a bus lock, just hanging on to the cache line for the duration.
We call this a "cache lock"; all modern CPUs work this way for aligned locked operations, only doing an expensive bus lock on something like xchg [mem], ax that spans a boundary between two cache lines. That hurts throughput on all cores, and is so expensive that modern CPUs have a way to make that always fault (but not other unaligned loads/stores), as well as performance counters for it.
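Whether a locked operand "spans a boundary between two cache lines" is plain address arithmetic. The sketch below assumes 64-byte lines, which is typical of current x86 CPUs but is not stated above; spans_cache_line and CACHE_LINE are names invented for this example.

```rust
/// Assumed cache-line size in bytes (typical for current x86 CPUs).
const CACHE_LINE: usize = 64;

/// Returns true if an access of `size` bytes starting at `addr`
/// touches two different cache lines.
fn spans_cache_line(addr: usize, size: usize) -> bool {
    // Compare the line index of the first and the last byte accessed.
    size > 0 && addr / CACHE_LINE != (addr + size - 1) / CACHE_LINE
}

fn main() {
    // A 2-byte operand, like the ax in xchg [mem], ax.
    println!("{}", spans_cache_line(0x1000, 2)); // false: both bytes in one line
    println!("{}", spans_cache_line(0x103f, 2)); // true: last byte of one line, first of the next
}
```

A naturally aligned operand (address a multiple of its power-of-two size) never straddles a line, which is why aligned locked operations can use the cheaper cache lock.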
Fun fact: xchg [mem], reg has implicit lock semantics on 386 and newer. (Which is unfortunate because it makes it unusable for performance reasons as just a plain load/store when you're running low on registers.) It didn't on 286 or earlier, unless you did lock xchg. This is possibly related to the fact that there were SMP 386 systems (with a primitive sequentially-consistent memory model). The modern x86 memory model applies to 486 and later SMP systems.
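An atomic exchange is the usual building block for a simple spinlock, and on x86 compilers typically implement it with xchg on a memory operand, relying on the implicit lock. The following is a rough sketch of that idea and is not taken from the answer above; SpinLock and locked_total are hypothetical names.

```rust
use std::hint;
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical minimal spinlock built on an atomic exchange.
struct SpinLock {
    held: AtomicBool,
}

impl SpinLock {
    const fn new() -> Self {
        SpinLock { held: AtomicBool::new(false) }
    }

    fn lock(&self) {
        // swap returns the previous value: true means someone else holds it.
        while self.held.swap(true, Ordering::Acquire) {
            hint::spin_loop();
        }
    }

    fn unlock(&self) {
        self.held.store(false, Ordering::Release);
    }
}

/// Each worker does a non-atomic-style load then store on the counter,
/// but only while holding the spinlock, so no update is lost.
fn locked_total(threads: u32, iters: u32) -> u32 {
    let shared = Arc::new((SpinLock::new(), AtomicU32::new(0)));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let s = Arc::clone(&shared);
            thread::spawn(move || {
                for _ in 0..iters {
                    s.0.lock();
                    let v = s.1.load(Ordering::Relaxed);
                    s.1.store(v + 1, Ordering::Relaxed);
                    s.0.unlock();
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    shared.1.load(Ordering::Relaxed)
}

fn main() {
    println!("total under spinlock: {}", locked_total(4, 50_000));
}
```

Real code would normally use std::sync::Mutex instead; the spinlock is here only to show what the exchange is doing.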