In x86, when I have two registers that I know each have exactly one bit set, and I want to know whether they're equal, I can use either `test` or `cmp` (`cmp a, b` will give zero when they're equal, `test a, b` will give zero when they're not equal).
Questions like In x86 what's difference between "test eax,eax" and "cmp eax,0" or Test whether a register is zero with CMP reg,0 vs OR reg,reg? say that when comparing to zero, `test` is preferred over `cmp`. Does this advice still hold when comparing two registers? Or does the fact that one check needs a zero result and the other a non-zero result change anything?
I'm mainly interested in comparing 64-bit registers on a 64-bit processor, but if there's a difference for 32-bit I would like to hear about it too. The most important targets are the latest Alder Lake and Zen 3, but other processors can be interesting too.
Answer:
In the scenario you described, both instructions perform identically on recent microarchitectures. On Alder Lake P, both can run on ports 0, 1, 5, 6, and 11 with a reciprocal throughput of 0.2 cycles (0.25 cycles, on slightly fewer ports, on Alder Lake E), while on Zen 3, both run on 4 ports with a reciprocal throughput of 0.25 cycles. The latency is 1 cycle in both cases.
As for macro fusion, both instructions fuse with `je` and `jne`, which are the branches you are interested in.
So in this particular case it does not make a difference. There may be a difference in other use cases, e.g. when immediates or other conditions are involved.
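The equivalence the question relies on can be checked with a small C sketch. The function names here are hypothetical, and the comments about which instruction a compiler emits are an assumption about typical code generation, not a guarantee. The sketch only shows that, for operands with exactly one bit set, the AND-based check (zero result means "different") and the equality check (zero difference means "equal") always agree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Precondition helper: true when x has exactly one bit set. */
bool has_single_bit(uint64_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

/* Equality check; assumed to compile to something like cmp + je/sete. */
bool same_bit_cmp(uint64_t a, uint64_t b) {
    return a == b;
}

/* AND-based check; assumed to compile to something like test + jne/setne.
   Only equivalent to a == b when both operands have exactly one bit set. */
bool same_bit_test(uint64_t a, uint64_t b) {
    assert(has_single_bit(a) && has_single_bit(b));
    return (a & b) != 0;
}

/* Compares the two checks over all 64 x 64 single-bit pairs and
   returns the number of pairs on which they disagree. */
int count_disagreements(void) {
    int bad = 0;
    for (int i = 0; i < 64; i++) {
        for (int j = 0; j < 64; j++) {
            uint64_t a = UINT64_C(1) << i;
            uint64_t b = UINT64_C(1) << j;
            if (same_bit_cmp(a, b) != same_bit_test(a, b)) {
                bad++;
            }
        }
    }
    return bad;
}
```

The single-bit precondition matters: for inputs such as 3 and 1, the AND is non-zero even though the values differ, so only the `cmp`-style check would be correct there.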