Why can variables be initialized (and used) without their declaration and definition "being run"?

Time:02-24

C++ disallows "goto-ing over" a definition:

goto jumpover;
int something = 3;
jumpover:
std::cout << something << std::endl;

This will raise an error as expected, because "something" won't be declared (or defined).

However, I jumped over using assembly code:

#include<iostream>
using namespace std;
int main(){
    asm("\njmp tag\n");
    int ptr=9000;//jumped over
    cout << "Ran" << endl;
    asm("\ntag:\n");
    cout << ptr << endl;
    return 0;
}

It printed 9000, although the line int ptr=9000;//jumped over was NOT executed (the program did not print Ran). I expected memory corruption or an undefined value when ptr is used, because the memory isn't allocated (although the compiler thinks it is, because it does not understand the asm). How can it know ptr is 9000?

Does that mean ptr is created and assigned at the start of main() (and therefore not skipped, due to some optimization or whatever), or is there some other reason?

CodePudding user response:

Jumping between asm() statements is not supported by GCC;
your code has undefined behaviour.
Literally anything is allowed to happen.

There's no __builtin_unreachable() after the first asm statement, and you didn't even use asm goto("" : : : : label) (GCC manual) to tell the compiler about a C label the asm statement might or might not jump to.
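As a rough sketch of what a supported version could look like, the function below uses asm goto so the compiler knows the asm statement may transfer control to a C label. The function name value_after_asm_jump and the label skipped are made up for illustration, and the jump instructions assume GCC or Clang targeting x86 (AT&T syntax) or AArch64. The variable is initialized before the jump, so no initialization is bypassed.

```cpp
// Sketch only: assumes GCC or Clang with GNU asm goto support.
// value_after_asm_jump and the label "skipped" are hypothetical names.
int value_after_asm_jump() {
    int value = 1;  // initialized before any jump can happen

    // The fifth section (after the fourth colon) lists the C labels the
    // asm statement may jump to, so the compiler keeps both paths alive.
#if defined(__x86_64__) || defined(__i386__)
    asm goto("jmp %l[skipped]" : : : : skipped);
#elif defined(__aarch64__)
    asm goto("b %l[skipped]" : : : : skipped);
#else
#error "This sketch only provides a jump instruction for x86 and AArch64."
#endif

    value = 9000;  // skipped at run time, and the compiler knows it might be

skipped:
    return value;  // 1, because the asm always takes the jump
}
```

Because the label is declared in the asm goto statement, the optimizer cannot assume the assignment of 9000 is always reached, so it should not constant-propagate 9000 into the return value.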

Whatever happens in practice with different versions of gcc/clang and different optimization levels when you do that is a coincidence / implementation detail / result of whatever the optimizer actually did.

For example, with optimization enabled it would do constant-propagation assuming that the int ptr=9000; statement would be reached, because it's allowed to assume that execution comes out the end of the first asm statement.

You'd have to look at the compiler's full asm output to see what actually happened. For example, https://godbolt.org/z/MbGhEnK3b shows GCC -O0 and -O2. With -O0 you do indeed get it reading uninitialized stack space, since it jumps over a mov DWORD PTR [rbp-4], 9000. With -O2 you get constant-propagation: mov esi, 9000 before the call to the std::basic_ostream<char,... operator<<(int) overload.

Re: "because the memory isn't allocated"

Space for it actually is allocated in the function prologue; compilers don't generate code to move the stack pointer every time they encounter a declaration inside a scope. They allocate space once at the start of a function. Even the one-pass Tiny C Compiler works this way: it doesn't use a separate push to alloc+init each separate int variable. (This is actually a missed optimization in some cases, when push would be useful to alloc+init in one instruction: What C/C++ compiler can use push pop instructions for creating local variables, instead of just increasing esp once?)
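A small well-defined example of the same idea, using a plain goto rather than asm: C++ allows jumping past a declaration of a scalar variable that has no initializer, and the variable is still usable afterwards because its storage does not depend on control flow reaching the declaration. The function name assign_after_skipped_declaration is hypothetical.

```cpp
// Sketch: legal C++, because "int x;" has no initializer, so the goto
// does not bypass any initialization. The function name is made up.
int assign_after_skipped_declaration() {
    goto after;
    int x;   // never "executed", but x is in scope and has storage
after:
    x = 5;   // assign before reading, so there is no undefined behaviour
    return x;
}
```

Adding an initializer (int x = 3;) turns this into the error from the question, since the jump would then skip the initialization.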


Even more so than most other kinds of C++ undefined behaviour, this is not something the compiler can actually detect at run-time to warn you about. asm statements just insert text into GCC's asm output, which is fed to the assembler. You need to accurately describe to the compiler what the asm does (using constraints and things like asm goto) to give the compiler enough information to generate correct code around your asm statement.

GCC does not parse the instructions in the asm template, it just copies it directly to the asm output. (Or for Extended asm, substitutes the %0, %1 etc. operands with text generated according to the operand constraints.)
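To illustrate operand substitution, here is a minimal sketch of an Extended asm statement whose constraints describe exactly what the instruction reads and writes. The function name add_with_asm is hypothetical, and the instruction text assumes GCC or Clang targeting x86 (AT&T syntax) or AArch64.

```cpp
// Sketch only: add_with_asm is a made-up name; x86 and AArch64 only.
int add_with_asm(int a, int b) {
#if defined(__x86_64__) || defined(__i386__)
    // %0 and %1 are replaced with the registers chosen for a and b.
    // "+r": a is read and written in a register; "r": b is a register input.
    // "cc" tells the compiler the flags register is modified.
    asm("addl %1, %0" : "+r"(a) : "r"(b) : "cc");
#elif defined(__aarch64__)
    // The w modifier selects the 32-bit view of each register.
    asm("add %w0, %w0, %w1" : "+r"(a) : "r"(b));
#else
#error "This sketch only provides an add instruction for x86 and AArch64."
#endif
    return a;
}
```

The compiler never inspects the add instruction itself. It relies only on the constraints, which is why leaving out an operand or a clobber can silently break the surrounding code.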
