AND vs SUB when converting lowercase to uppercase assembly

Time:01-13

I was wondering why you would use the and instruction instead of the sub instruction when converting lowercase ASCII characters to uppercase ones.

mov dx, 'a'
sub dx, 32

Vs

mov dx, 'a'
and dx, 11011111b

CodePudding user response:

Either one is acceptable; it's just a matter of preference. I like to use AND myself. It shouldn't matter as long as you've first checked that your character is between 'a' and 'z'.
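To make the "check the range first" point concrete, here is a minimal C sketch of the same logic. The function name to_upper_checked is made up for illustration and is not from the question; inside the 'a'..'z' range, clearing bit 5 and subtracting 32 give the same result, so either instruction could sit behind the check.

```c
#include <stdint.h>

/* Hypothetical helper: upper-case one ASCII byte, leave everything else alone. */
uint8_t to_upper_checked(uint8_t c)
{
    if (c >= 'a' && c <= 'z') {
        /* Same effect as `and dx, 11011111b`; `c -= 32` would work equally well here. */
        c &= (uint8_t)~0x20u;
    }
    return c;
}
```

With the range check in place, bytes such as '{' or '1' pass through unchanged no matter which of the two instructions is used.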

CodePudding user response:

There's no performance or correctness difference if you already know the input is a lower-case alphabetic character. AND has the advantage when you know the input is alphabetic but it might already be upper-case, since it leaves upper-case letters unmodified, whereas SUB would turn 'A' into '!'. (AND is also useful as part of detecting alphabetic characters and normalizing to one case, either AND with ~0x20 or OR with 0x20, as in "What is the idea behind ^= 32, that converts lowercase letters to upper and vice versa?")
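A small C sketch of that difference, assuming plain ASCII bytes. The function names are hypothetical. The alphabetic test is one common way to do "normalize case, then one unsigned range check"; it is not necessarily the exact sequence the linked answer uses.

```c
#include <stdint.h>

/* Mirrors `and dx, 11011111b`: clear bit 5 (0x20). */
uint8_t upper_with_and(uint8_t c) { return (uint8_t)(c & 0xDF); }

/* Mirrors `sub dx, 32`. */
uint8_t upper_with_sub(uint8_t c) { return (uint8_t)(c - 32); }

/* Force lower case with OR 0x20, then a single unsigned compare:
   anything outside 'a'..'z' wraps around to a value above 25. */
int is_ascii_alpha(uint8_t c)
{
    return (uint8_t)((c | 0x20) - 'a') <= 'z' - 'a';
}
```

Both functions map 'a' to 'A', but only the AND version leaves an input that is already 'A' alone; the SUB version turns it into '!' (65 - 32 = 33).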


If the next instruction is a jcc like jnz, SUB and AND are equally able to macro-fuse with it into a single uop on Intel Sandybridge-family CPUs, so there is no advantage there either.

If you're using it in a loop over a zero-terminated C string, you might do something like movzx edx, byte [rdi] / and edx, ~0x20 / jnz .loop at the bottom of the loop, since all alphabetic characters have non-zero bits other than the lower-case bit. The result is zero only for the terminating 0 byte and for 0x20 (ASCII space), so the loop also stops on a space.

Using SUB in that case lets you also exit the loop on any character less than space, i.e. control characters such as tab or newline: sub edx, 0x20 / ja .loop stops on a space or anything below it, while jae .loop keeps looping even on a space (but still not on a tab or newline).
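The exit conditions of those loop variants can be modelled in C. This is only a sketch of which byte stops each loop, not a translation of the full assembly loop; the function names are invented for illustration, and each function returns how many bytes are consumed before the loop would exit.

```c
#include <stddef.h>
#include <stdint.h>

/* and edx, ~0x20 / jnz .loop: stops when the byte is 0 after clearing
   bit 5, i.e. on the terminating NUL or on a space (0x20). */
size_t span_and_jnz(const char *s)
{
    size_t n = 0;
    while (((uint8_t)s[n] & 0xDF) != 0)
        n++;
    return n;
}

/* sub edx, 0x20 / ja .loop: continues only while the byte is above 0x20. */
size_t span_sub_ja(const char *s)
{
    size_t n = 0;
    while ((uint8_t)s[n] > 0x20)
        n++;
    return n;
}

/* sub edx, 0x20 / jae .loop: continues while the byte is 0x20 or above. */
size_t span_sub_jae(const char *s)
{
    size_t n = 0;
    while ((uint8_t)s[n] >= 0x20)
        n++;
    return n;
}
```

For the input "ab\tcd", the AND version runs through the tab to the terminator, while both SUB versions stop at the tab. For "ab cd", only the jae version continues past the space.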
