Why should certain registers be saved? What could go wrong if not? [duplicate]

Using assembly x86_64 in MASM, there are some registers whose value must be saved before using; they must be pushed to the stack. These are: RBP, RBX, RSP, R12, R13, R14, and R15. Taking for example RBX, why should it be saved?

What consequences could it have to simply use it as it is, not saving it?

And if there is a consequence, is it something that affects only the program I'm currently running, or can it affect something else?

Are the registers used outside of assembly? Meaning is the CPU using those registers without any user manipulation in assembly, or for example when the user is launching a program (video game or whatever), or a process is being used?

CodePudding user response：

A calling convention is how functions can call each other and pass/return args without stepping on each other's toes. That includes compiler-generated code calling your hand-written function: the convention defines what the compiler is allowed to assume about register contents after your function returns.

See "What are callee and caller saved registers?" for more about call-preserved vs. call-clobbered registers, and how a calling convention with some of each lets you (or a compiler) create efficient asm.

"Why does Windows64 use a different calling convention from all other OSes on x86-64?" discusses the advantages of the x86-64 System V calling convention, and how / why its design choices were made. See also this re: why (some) args are passed in registers.

What happens when you violate the calling convention

The problems you'd have from clobbering call-preserved registers would be like those in "Is function call messing with other registers than %rax?", but they would affect non-buggy callers that were keeping their locals in call-preserved registers. (That is unlike that question and several other SO questions, where people tried to call a function in a loop but kept their loop vars in registers that functions are allowed to clobber.)

e.g. a loop like this might be infinite, or end right away, or even crash if it was a debug build using RBP as a frame pointer and you clobbered it. Basically, imagine the consequences of having a call to your function step on any of your caller's local variables.

  for (int i=0; i<10; i++) {
     int tmp = my_handwritten_asm_function(i);
     printf ("%d %d\n", i, tmp);
  }

Since there are function calls in the loop, the compiler will keep i in a call-preserved register like EBX (unless it fully unrolls the loop and simply emits mov ecx, 1 / call / mov ecx, 2 / call, etc.). If your asm function modifies EBX, that's obviously bad.

(Compiler optimizations could make the function call happen with things in registers you wouldn't expect from a simple transliteration to machine code, since it's allowed to assume that its local variables are private and its call-preserved registers won't be stepped on.)
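To make that failure mode concrete, here is a minimal sketch in plain C that only models the situation; it does not touch real registers. The global reg_rbx stands in for a call-preserved register, and the names well_behaved_callee, clobbering_callee and count_iterations are hypothetical, invented for this illustration. The caller keeps its loop counter in the "register" across calls, the way a compiler might keep i in EBX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: this global stands in for a call-preserved register such as RBX.
   Real registers are not C variables; this only mimics the sharing. */
static uint64_t reg_rbx;

/* Hypothetical callee that uses the "register" as scratch space but
   restores the caller's value before returning. */
int well_behaved_callee(int x) {
    uint64_t saved = reg_rbx;      /* plays the role of: push rbx */
    reg_rbx = (uint64_t)x * 2;     /* use the register for our own work */
    int result = (int)reg_rbx + 1;
    reg_rbx = saved;               /* plays the role of: pop rbx */
    return result;
}

/* Hypothetical callee that violates the convention: same work, no save/restore. */
int clobbering_callee(int x) {
    reg_rbx = (uint64_t)x * 2;
    return (int)reg_rbx + 1;
}

/* Caller that keeps its loop counter in the "register" across the calls.
   Returns how many iterations actually ran (capped at 1000 so that a
   misbehaving callee cannot make this demo loop forever). */
int count_iterations(int (*callee)(int)) {
    assert(callee != NULL);
    int iterations = 0;
    for (reg_rbx = 0; reg_rbx < 10 && iterations < 1000; reg_rbx++) {
        (void)callee((int)reg_rbx);
        iterations++;
    }
    return iterations;
}
```

In this model, count_iterations(well_behaved_callee) runs the expected 10 iterations, while count_iterations(clobbering_callee) stops after 4 because the callee keeps overwriting the counter. A different clobbering pattern could just as easily keep the counter below 10 forever, which is the "infinite loop" case mentioned above.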

Of course, some callers may not depend on a certain register value, especially things like the code that calls your main. So it's not rare to be able to get away with violating the calling convention / ABI in a hand-written main.

Not breaking is not evidence of correctness; especially in assembly language, it's not rare for dangerous code to "happen to work", but still be broken in ways that would be a problem with different surrounding code.

Are the registers used outside of assembly?

Everything the CPU runs is assembly language; there is no outside. (Actually it runs machine code, but that corresponds approximately 1:1 with assembly.) Often that's compiler-generated code from high-level languages. See "How to remove "noise" from GCC/clang assembly output?" for more about looking at compiler-generated asm.

But CPU registers are private to each thread under a multi-tasking OS; the OS's context-switches effectively virtualize them. So only code that actually calls your function (directly or indirectly) would be affected.

CodePudding user response：

To add to @Peter's excellent answer:

Oversimplifying a bit, a program tells the processor what to do by giving it a sequence of machine code instructions to execute, one after the other, which the processor does very quickly but totally trusting the program to do something useful without knowing or caring what that is going to be.

As these machine code instructions execute, they change the state of the process (a process is the execution environment for these instructions), and that is how one instruction communicates with another; this eventually builds program answers/results piece by piece. The CPU registers are fundamental to that process state, and most short-term modifications of the process state happen in those CPU registers. One instruction modifies a register and the next instruction looks at that register for the next part of the program, all under control of the program and its sequence of machine code instructions.

There is a part of the process that incorporates the values for all the CPU registers, and it is called a thread. Operating systems virtualize the CPU so each thread appears to have its own set of CPU registers. Continuity of the registers and their values is crucial to the thread, since that state is used to communicate between machine code instructions to build larger results piece by piece.

One thing that the machine code program can do is call functions. When one function calls another function, the caller is effectively suspended waiting for the called function to complete and return its answer/data. Once a called function returns to its caller, the suspended caller is resumed, and continues with its own programming found after that function call.

In order for the caller to be able to resume properly, its suspended state must be intact. Otherwise, it may resume with incorrect variable values, which would lead to bad behaviors.

When one function calls another, the callee may itself call yet more functions, and so on. To support this, the thread has a call stack, and the functions currently suspended while waiting for the most recent one to finish form the call chain. Those suspended functions rely on the preserved registers, whose values constitute important state for them. However, since CPU registers number only in the low dozens while a program may contain tens of thousands of functions, functions must reuse CPU registers that are already in use by suspended functions. As long as the in-use registers are returned to their original values, the callers will be none the wiser and will resume as desired.

So, if a callee fails to properly preserve the preserved registers, then some bit of state for some caller will be modified unexpectedly, and anything could happen as a result, because the outcome depends on the specific programming of whichever suspended function had its state unexpectedly modified.
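The call-chain idea can be sketched as a small C model. This is only an illustration under stated assumptions: the global reg_r12 stands in for one preserved register, save_stack stands in for the thread's call stack, and the function names nested and caller_value_after_chain are hypothetical. Every level of the chain reuses the same "register" for its own local, and it works only because each level saves the old value on entry and restores it before returning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t reg_r12;          /* stands in for one call-preserved register */
static uint64_t save_stack[64];   /* stands in for the thread's call stack */
static size_t save_top;

/* Save the current "register" value (models a push in a function prologue). */
static void push_reg(void) {
    assert(save_top < 64);
    save_stack[save_top++] = reg_r12;
}

/* Restore the most recently saved value (models a pop in a function epilogue). */
static void pop_reg(void) {
    assert(save_top > 0);
    reg_r12 = save_stack[--save_top];
}

/* Each level keeps its own local in the shared "register", calls one level
   deeper, and then relies on the register still holding its local. */
uint64_t nested(unsigned depth) {
    push_reg();                                 /* preserve the caller's value */
    reg_r12 = (uint64_t)depth * 100;            /* our local now lives in the register */
    uint64_t below = depth > 0 ? nested(depth - 1) : 0;
    uint64_t result = reg_r12 + below;          /* still depth * 100 after the call */
    pop_reg();                                  /* hand the caller's value back */
    return result;
}

/* Outermost caller: puts its own value in the "register", runs the chain,
   and reports what the register holds afterwards. */
uint64_t caller_value_after_chain(uint64_t caller_value, unsigned depth) {
    reg_r12 = caller_value;
    (void)nested(depth);
    return reg_r12;
}
```

Here nested(3) returns 300 + 200 + 100 + 0 = 600, and caller_value_after_chain(42, 5) returns 42, since every level put back what it found. Deleting the push_reg/pop_reg pair from nested would leave each suspended level reading the value written by the level below it, which is the unexpected state change described above.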