When switching from compatibility mode to 64-bit mode at the same privilege level by a far call, fields such as BASE or LIMIT in segment registers are ignored, and 64-bit pointer registers are all available. However, if the BASE field is ignored, the stack base address will become 0x0 implicitly, which means an implicit stack switch will be raised, am I right? And if my understanding is correct, how does the CPU switch back to the original stack when returning via the ret instruction?
CodePudding user response:
Will an implicit stack switch occur when switching from compatibility mode to 64-bit mode at the same privilege level?
No. Switching from compatibility mode to 64-bit mode at the same privilege level does not cause an implicit stack switch, unless the transition is made by an interrupt that uses the IST mechanism.
However, if the BASE field is ignored, the stack base address will become 0x0 implicitly, which means an implicit stack switch will be raised, am I right?
For 80x86, a linear address is in general calculated by adding an offset within a segment to the base of that segment, where the base is stored in a "hidden" part of the segment register. For example, in 32-bit code, if you do mov eax,[esp] then the CPU calculates "linear_address = SS.base + ESP".
Because most operating systems use a "flat memory model" where segmentation is effectively disabled (by setting all segment bases to zero and all segment limits to "max"), CPUs are optimized for this specific case so that the addition is skipped when the segment base is known to be zero (e.g. for that mov eax,[esp] the CPU may cheat and do "linear_address = ESP" if it already knows SS.base is zero).
For 64-bit code, excluding FS and GS segment registers, the value in the "hidden" part of the segment register (used for segment base) is assumed to be zero regardless of whether it is or not; and the CPU knows the segment base is always "assumed to be zero" and always optimizes the address calculation (by not adding segment base).
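As a rough illustration of the address calculation described above, here is a toy Python model. The function name and parameters are made up for this sketch, and it deliberately ignores limit checks, canonical-address checks and paging; it only shows when the segment base takes part in the sum.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def linear_address(seg_base, offset, long_mode, seg="ss"):
    """Toy model (not real hardware behaviour in every detail).

    seg_base  -- value in the hidden base field of the segment register
    offset    -- effective address, e.g. the value of ESP/RSP
    long_mode -- True for 64-bit mode, False for 32-bit/compatibility mode
    seg       -- segment register name; only "fs"/"gs" keep their base in 64-bit mode
    """
    if long_mode:
        # 64-bit mode: the base of CS/DS/ES/SS is treated as zero,
        # whatever the hidden part of the register contains.
        base = seg_base if seg in ("fs", "gs") else 0
        return (base + offset) & MASK64
    # 32-bit / compatibility mode: the base is always added (wraps at 32 bits).
    return (seg_base + offset) & MASK32

# Same SS base and stack pointer, two different modes:
print(hex(linear_address(0x10000, 0x2000, long_mode=False)))  # base is added
print(hex(linear_address(0x10000, 0x2000, long_mode=True)))   # base is ignored
```

With a flat model (base 0) both calls give the same result, which is why this difference normally goes unnoticed.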
When switching from compatibility mode to 64-bit mode at the same privilege level, there are two possibilities:
a) The segment base for SS was zero in compatibility mode, and becomes "assumed zero" in 64-bit, and therefore the address of the stack doesn't change (see note).
b) The segment base for SS was non-zero in compatibility mode, and becomes "assumed zero" in 64-bit, and therefore the address of the stack does change.
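The two cases can be checked with a small self-contained sketch. The helper names and the example values are hypothetical, and the model assumes the stack pointer value itself is unchanged by the mode switch (see the note about the upper half of RSP).

```python
MASK32 = 0xFFFFFFFF

def stack_address_compat(ss_base, esp):
    # Compatibility mode: SS.base is added to ESP (32-bit wrap-around).
    return (ss_base + esp) & MASK32

def stack_address_long(ss_base, rsp):
    # 64-bit mode: SS.base is assumed to be zero, so it is not used at all.
    return rsp

def stack_moves_on_switch(ss_base, sp):
    """True if the linear address of the stack top differs between the modes."""
    return stack_address_compat(ss_base, sp) != stack_address_long(ss_base, sp)

# Case a) flat model: SS.base == 0, the stack stays where it was.
print(stack_moves_on_switch(0x00000000, 0x0012F000))
# Case b) SS.base != 0, the same stack pointer now refers to other memory.
print(stack_moves_on_switch(0x00400000, 0x0012F000))
```

Nothing is "switched" in case b in the architectural sense; the same register value is simply interpreted against a different base.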
The latter possibility (non-zero SS segment base in compatibility mode) is something I'd strongly avoid, as it's horribly confusing for programmers and has a "higher than normal" risk of quirks/errata (e.g. a future CPU doing things in a slightly different order and storing return information at "ss.base + esp" instead of at "0 + esp").
Note: In 64-bit code, anything that only updates the lower half of a register causes the upper half of the 64-bit register to be zeroed (e.g. loading the value 0x9ABCDEF0 into ESP will cause RSP to be set to 0x000000009ABCDEF0). I don't think this happens in compatibility mode (or at least, I don't think it's guaranteed to happen). This may potentially create a situation where "junk" left in the higher half of RSP before switching to compatibility mode is still present when you switch back to 64-bit (e.g. possibly causing the address of the stack to change from 0x9ABCDEF0 to 0x123456789ABCDEF0).
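To make the arithmetic in this note concrete, here is a small Python sketch. The first function models the zero-extension that 64-bit mode performs on 32-bit register writes. The second function is purely hypothetical: it models the behaviour the note worries about (upper half left untouched) and is not a statement about what any real CPU does in compatibility mode.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def write_esp_in_64bit_mode(old_rsp, value32):
    # 64-bit mode: writing the 32-bit register zeroes bits 63..32 of RSP.
    return value32 & MASK32

def write_esp_keeping_upper_half(old_rsp, value32):
    # Hypothetical: the upper half of RSP survives a 32-bit write.
    upper = old_rsp & (MASK64 ^ MASK32)
    return upper | (value32 & MASK32)

junk_rsp = 0x1234567800000000  # "junk" left in the upper half of RSP

print(hex(write_esp_in_64bit_mode(junk_rsp, 0x9ABCDEF0)))
print(hex(write_esp_keeping_upper_half(junk_rsp, 0x9ABCDEF0)))
```

If the second case could happen, 64-bit code entered from compatibility mode would be safest explicitly re-establishing the upper half of RSP (for example with a 32-bit move of ESP onto itself) before touching the stack.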
CodePudding user response:
Note that in x86 terminology, "stack switching" has a specific technical meaning involving loading a new SS:[ER]SP from the TSS, e.g. when user-space runs an int instruction. This is not what you're talking about; you just mean that ss.base + esp might be different from 0 + rsp.
Yes, I think that's correct, in the unusual case where your 32-bit code had SS.base != 0.
All mainstream OSes use a flat memory model with CS/DS/ES/SS bases all zero (and limit=-1), so this is a non-issue.
BTW, Windows x64 does this user-space mode switch in practice in its DLLs, as part of its WoW64 (i.e. Windows (32-bit) on Windows64). Instead of using sysenter or whatever to call into the kernel directly, user-space makes a far call to 64-bit code that uses syscall.
That would be a lot less convenient if 32-bit code was running with a non-zero SS base. (I'm not sure whether the CS:EIP return address would be pushed at ss.base + ESP or at [RSP]; if the latter, it would still be possible for the 64-bit code to actually return if memory was allocated at rsp as well as at ss.base + rsp.)