So I'm self-studying a compilers textbook and I'm getting the hang of it, but I have a question about the compilation of simple high-level expressions, such as 52 + -10, to x86-64 (AT&T).
Consider the following expression:
10 + 32
which generates the following assembly:
.globl main
main:
movq $10, %rax
addq $32, %rax
retq
But this next expression, 52 + -10, generates the following x86-64 assembly:
.globl main
main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movq $10, -8(%rbp)
negq -8(%rbp)
movq -8(%rbp), %rax
addq $52, %rax
addq $16, %rsp
popq %rbp
retq
My understanding is the following: to compile a high-level expression such as 52 + -10, you need to remove complex operands, which means introducing intermediate variables, such as:
tmp_0 = -10
52 + tmp_0
So I am guessing that the difference lies in the fact that there are intermediate variables involved. In the book, in relation to this example (i.e. 52 + -10), it says:
We exhibit the use of memory for storing intermediate results in the next example. Figure 2.7 lists an x86 program that computes 52 + -10. This program uses a region of memory called the procedure call stack (stack for short).
So I'm wondering why the expression 52 + -10 uses the stack (due to the intermediate variable) and why 10 + 32 doesn't. The stack, a region of memory, is used to store the intermediate variables, but I want to know WHY.
Thanks.
CodePudding user response:
So, what I was missing is that the stack is the region of memory for storing local variables. For each function call, more stack space is allocated; in my example this allocation corresponds to subq $16, %rsp. We need to allocate one variable, so that's 8 bytes, but the allocation has to be rounded up to a multiple of 16 to keep the stack 16-byte aligned, hence the 16.
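For example (a hedged sketch following that same rounding rule, not code from the book): with three 8-byte temporaries you'd need 24 bytes, rounded up to 32:
main:
pushq %rbp
movq %rsp, %rbp
subq $32, %rsp        # 3 variables * 8 bytes = 24, rounded up to 32
movq $1, -8(%rbp)     # hypothetical tmp_0
movq $2, -16(%rbp)    # hypothetical tmp_1
movq $3, -24(%rbp)    # hypothetical tmp_2
movq -8(%rbp), %rax
addq $32, %rsp
popq %rbp
retq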
Take a look at this link to understand the basics :-) https://courses.engr.illinois.edu/cs225/fa2022/resources/stack-heap/
CodePudding user response:
Real-world compilers aren't that dumb, even when you tell them not to optimize (like clang -O0). They evaluate constant expressions at compile time to a single integer, because that's easier than carrying the logic of all those operators throughout the work of transforming the program into assembly or machine code.
For example, even MSVC (Godbolt) compiles return 52 + -10 to mov eax,42 / ret, and that's a compiler that in debug builds will sometimes do insane things like compiling if(true);else to materializing a 1 in a register and comparing or testing it against itself, instead of optimizing away the else side entirely like some other compilers, or at least using an unconditional jmp.
Compile-time eval is sometimes required in languages like C: for example, static int arr[10 - 1]; is legal, and the size of a static array has to be a compile-time constant. Since the compiler needs to be able to do that, it makes sense to just always do it when simple, even without optimization enabled.
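A minimal C sketch of that requirement (my own illustration, not from the book or this answer):
/* The array size must be a compile-time constant expression,
 * so the compiler has to fold 10 - 1 down to 9 before it can
 * even lay out this object. */
static int arr[10 - 1];

int main(void) {
    return sizeof(arr) / sizeof(arr[0]);   /* returns 9 */
}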
Part of the goal of gcc -O0 is to compile fast (not most simply / literally / naively), without caring about efficiency of the generated code. It's still faster to eval that integer expression soon after parsing than to carry it around and later generate machine code for it.
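For instance (a hedged illustration; exact output varies by GCC version), a function like
int f(void) { return 52 + -10; }
compiled with gcc -O0 typically still boils down to movl $42, %eax plus the usual -O0 frame-pointer setup, because the constant was folded during parsing.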
But if you have a truly naive compiler that chooses to be that inefficient:
Like Alexander commented, it's probably like C, where -32 isn't a single constant; it's the unary operator - applied to the constant 32. (Fun fact: that's why INT_MIN isn't defined as -2147483648; that would have type long long.) It chooses not to do constant-propagation at compile time to get a negative integer constant.
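A sketch of how a typical limits.h dodges that (the exact spelling varies by implementation, but glibc's is essentially this):
/* -2147483648 parses as unary minus applied to 2147483648, and
 * 2147483648 doesn't fit in int, so the expression would get a
 * wider type. Computing the value from INT_MAX keeps the whole
 * expression in int. */
#define INT_MAX 2147483647
#define INT_MIN (-INT_MAX - 1)   /* value -2147483648, type int */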
And this dumb compiler can't negate in a register as the first part of a larger expression. There's no reason it actually needs stack space, though: since it's already working on the right-hand operand of + first, unlike in the earlier example, it could have started with movq $10, %rax, negated in place, and then added 52.
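Concretely, a register-only version of the 52 + -10 program (my sketch, same conventions as the book's examples) would be:
.globl main
main:
movq $10, %rax      # right-hand operand: 10
negq %rax           # %rax = -10
addq $52, %rax      # %rax = 52 + -10 = 42
retq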