I have been trying to learn assembly lately. A friend of mine showed me the following instructions:
mov EAX, [EDX]
xor EAX, 2
mov [EBX], EAX
inc EDX
inc EBX
To which I replied,
1. The 4-byte value at the address held in `edx` is loaded into `eax`
2. The value in `eax` is XORed with 2
3. The value in `eax` is stored to the memory address held in `ebx`
4. `edx` and `ebx` are each incremented by 1
Is my explanation correct? Please correct me if I am wrong.
Secondly, he asked where this is used the most.
Is this some kind of standard code routine? Can anyone tell me where it is used the most?
Please pardon any mistakes in my understanding. I am very new to assembly.
CodePudding user response:
This looks pointlessly inefficient and weird (if it runs in a loop, the 4-byte loads and stores overlap because the pointers only advance by 1 each time), so I doubt you'd find it anywhere. But yes, you've correctly described what it does.
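It's also not normal to copy a buffer and just flip the 2nd-lowest bit (value 2) of every byte, which is the net effect in a loop, assuming EDX and EBX point to non-overlapping buffers. Each store writes 4 bytes, but only its lowest byte has been XORed, and the next three stores overwrite the other three bytes.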
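To make the overlapping behaviour concrete, here is a minimal C sketch of the five instructions run `n` times in a loop. The function name `emulate_loop` and the loop itself are assumptions (the question only shows the loop body), and the 4-byte register is modelled as a byte array so that the little-endian x86 behaviour is reproduced on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the question's instructions executed n times.
   Both buffers must have at least n + 3 accessible bytes, because every
   iteration touches 4 bytes but the pointers only advance by 1. */
void emulate_loop(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned char eax[4];      /* eax[0] is the low byte (little-endian) */
        memcpy(eax, src + i, 4);   /* mov eax, [edx] */
        eax[0] ^= 2;               /* xor eax, 2 only changes the low byte */
        memcpy(dst + i, eax, 4);   /* mov [ebx], eax */
        /* inc edx / inc ebx correspond to i++ */
    }
}
```

After the loop, `dst[i]` equals `src[i] ^ 2` for every `i < n`, and the three bytes past the end are plain copies of the source written by the last store.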
But if you were going to do that, you'd either go a byte at a time with `movzx eax, byte [edx]` loads and `mov [ebx], al` stores, or you'd `add ebx,4` / `add edx,4` so you could `xor eax, 0x02020202`. (Or better, use SSE1 or SSE2 to copy and XOR 16 bytes at a time.)
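As a rough C sketch of those alternatives, the function below copies a buffer while XORing every byte with 2: 16 bytes at a time when SSE2 is available, then 4 bytes at a time with the `0x02020202` constant, then single bytes for the tail. The name `copy_xor2` is hypothetical, and a compiler may well generate similar code from the plain byte loop on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper: copy n bytes from src to dst, flipping bit 1
   (value 0x02) of every byte. The buffers must not overlap. */
void copy_xor2(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t i = 0;
    assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
    const __m128i mask16 = _mm_set1_epi8(0x02);
    for (; i + 16 <= n; i += 16) {          /* 16 bytes per iteration */
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_xor_si128(v, mask16));
    }
#endif
    for (; i + 4 <= n; i += 4) {            /* 4 bytes per iteration */
        uint32_t w;
        memcpy(&w, src + i, 4);             /* like mov eax, [edx] */
        w ^= UINT32_C(0x02020202);          /* same mask in all four bytes */
        memcpy(dst + i, &w, 4);             /* like mov [ebx], eax */
    }
    for (; i < n; i++)                      /* leftover bytes, one at a time */
        dst[i] = (unsigned char)(src[i] ^ 0x02);
}
```

Because the mask repeats the same byte, the 4-byte step gives the same result regardless of byte order, and none of the loads or stores overlap.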
With a more complicated XOR constant, this gets used to lightly obfuscate strings or other data from simple low-effort reverse engineering, like running `strings ./a.out` to find contiguous ASCII bytes in an executable.
See for example glibc `memfrob(3)`, which XORs with 42 (`0x2a`), in-place instead of copying. As you can tell from the constant, which doesn't have its high bit set (so most printable ASCII characters stay printable ASCII), this function is at least half a joke. People will still find your strings; they're just not immediately readable.
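For illustration, here is a portable C sketch of the same idea. `frob42` is a hypothetical stand-in written for this example, not glibc's implementation; the real function is declared as `void *memfrob(void *s, size_t n);` in `<string.h>` when `_GNU_SOURCE` is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for memfrob-style obfuscation:
   XOR each of the first n bytes of s with 42 (0x2a), in place. */
void *frob42(void *s, size_t n)
{
    unsigned char *p = s;
    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        p[i] ^= 0x2a;
    return s;
}
```

Calling it on the six bytes of `"secret"` yields `"YOIXO^"`, which is still printable ASCII, and calling it a second time restores the original, since XOR with the same constant is its own inverse.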