Are unaligned writes to an immediate operand in machine code safe while that code is executing?

Let's say I have x86-64 code that looks like this (though this question applies more generally to all code):

mov rbx,7F0140E5247Dh
jmp rbx

Is it safe to overwrite the target constant while that code could be executing, if the constant is not aligned? In other words, could I observe a partially updated jump target, resulting in a jump to a nonexistent address? Additionally, is this safe if the target constant crosses a page or cache-line boundary?

Edit:

I'm only interested in changing single instructions and not changing the instruction boundary locations.

CodePudding user response：

Only if the write is atomic. An unaligned qword store is guaranteed atomic on Intel as long as it doesn't span a cache-line boundary, but that is not guaranteed on AMD. The lowest-common-denominator atomicity guarantee is that 8-byte-aligned stores are atomic, and no more than that.
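As a rough illustration of that rule, a patcher could check where the 8-byte immediate falls before deciding whether a plain store is enough. This is only a sketch: the function names and the intel_only flag are made up for this example, and the 64-byte cache-line size is an assumption that holds for current mainstream x86-64 CPUs but is not stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the byte range [addr, addr + len) spans a multiple of `boundary`.
   `boundary` must be a power of two: 8, 16, 64 (assumed cache-line size),
   4096 (page), and so on. */
static bool crosses_boundary(uintptr_t addr, size_t len, uintptr_t boundary)
{
    assert(len > 0 && boundary != 0 && (boundary & (boundary - 1)) == 0);
    uintptr_t first = addr & ~(boundary - 1);             /* block holding the first byte */
    uintptr_t last  = (addr + len - 1) & ~(boundary - 1); /* block holding the last byte  */
    return first != last;
}

/* Hypothetical policy helper: is a plain 8-byte store to imm_addr atomic?
   - 8-byte aligned: yes on every x86-64 CPU.
   - otherwise: only relied on when targeting Intel, and only if the store
     stays inside one (assumed 64-byte) cache line. */
static bool plain_qword_store_is_atomic(uintptr_t imm_addr, bool intel_only)
{
    if ((imm_addr & 7) == 0)
        return true;
    return intel_only && !crosses_boundary(imm_addr, 8, 64);
}
```

The same crosses_boundary helper can be called with 4096 to detect the page-crossing case asked about in the question.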

Use an xchg to do a guaranteed-atomic RMW. I believe that is still correct if the constant itself crosses a cache-line boundary, but it will be very slow. (It takes a bus lock, not just a cache lock. Split locks are slow enough that there's a perf counter just for them, and even a CPU feature to make them fault, at least in kernel code, so you can find instances of them in a VM.) If the constant doesn't span a problematic boundary for whatever CPU this is, it should be as fast as an aligned atomic operation.
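For the aligned case, the exchange can be written in portable C11, and on x86-64 compilers typically turn it into an xchg with a memory operand (which is implicitly locked). The sketch below is only a model: swap_immediate is a made-up name, and it operates on a properly aligned _Atomic uint64_t. Pointing it at a misaligned immediate inside a code page is outside what the C standard defines, so that case would need inline asm or a compiler-specific guarantee, and the page would also have to be mapped writable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Atomically replace an 8-byte jump-target constant and return the old value.
   Assumption: `imm` is properly aligned for _Atomic uint64_t. */
static uint64_t swap_immediate(_Atomic uint64_t *imm, uint64_t new_target)
{
    assert(imm != NULL);
    /* Sequentially consistent exchange: no reader can see a mix of old and new bytes. */
    return atomic_exchange(imm, new_target);
}
```

The returned old value is not needed for patching, but it comes for free with an exchange and can be handy for logging or for undoing a patch.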

Or, on CPUs with AVX, 16-byte aligned SSE/AVX stores are guaranteed atomic. (This was only recently documented after years of being known to be basically safe in practice, but fortunately it's retroactive to all AVX CPUs, with no new feature bit.) So if you can arrange for the constant not to span a 16-byte boundary, you can update it that way. (Overwriting the surrounding bytes with themselves can't cause a problem, unless another thread is also updating another constant very nearby.)

If performance matters for this (e.g. you're doing it more than once a minute or so), it's probably worthwhile to use some padding or a NOP to get the constant 8-byte aligned, especially if you can just lengthen earlier instructions so you don't need an actual NOP, or even lengthen the mov r64,imm64 itself. (Although it's already 10 bytes, and the max length for an instruction is 15.)
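To make the padding idea concrete: mov r64,imm64 is encoded as a REX.W prefix, a one-byte opcode, and then the 8-byte immediate, so the immediate starts 2 bytes into the instruction. The sketch below computes how many bytes of padding (NOPs or lengthened earlier instructions) would put the immediate on an 8-byte boundary. The function and constant names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

enum {
    IMM_OFFSET = 2, /* REX.W prefix + opcode byte precede the imm64 */
    IMM_ALIGN  = 8  /* alignment at which a plain qword store is atomic everywhere */
};

/* Number of padding bytes (0..7) to place before a mov r64,imm64 that would
   otherwise start at insn_addr, so that its immediate is 8-byte aligned. */
static unsigned padding_for_aligned_imm(uintptr_t insn_addr)
{
    uintptr_t imm_addr = insn_addr + IMM_OFFSET;
    unsigned pad = (unsigned)((IMM_ALIGN - (imm_addr % IMM_ALIGN)) % IMM_ALIGN);
    assert((imm_addr + pad) % IMM_ALIGN == 0);
    return pad;
}
```

For example, an instruction that would start at 0x1000 has its immediate at 0x1002 and needs 6 bytes of padding, while one that starts at 0x1006 already has its immediate at 0x1008 and needs none.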

This does not fully generalize to replacing multiple instructions

Rewriting a sequence of instructions with one whose instruction boundaries fall in different places would be a different story. You say the question applies "more generally", but it only generalizes to replacing an immediate, or to replacing a whole 4-byte or 8-byte instruction with one of the same length. If another thread could be sleeping or running with RIP inside the region you're writing, you have to consider code-fetch, after the update, from any RIP that was possible in the old sequence. So as I said, changing instruction boundaries is problematic.

But if you respect that limitation, cross-modifying code is AFAIK safe. I think Windows hot-patching quiesces other threads that might be running code, but I don't know why, since it already makes sure there's a single large-enough instruction for it to overwrite. Either they're over-cautious, or there's some risk I'm not aware of with code-fetch not respecting store atomicity. Maybe it's just that they don't want to depend on 2-byte store atomicity in case of unaligned functions, even though that's the default for separate reasons with normal compiler settings.