The placement of static global and global variables with the same identifier-CodePudding

I'm learning some basics about linking and encountered the following code.

file: f1.c

#include <stdio.h>

static int foo;

int main() {
    int *bar();
    printf("%ld\n", bar() - &foo);
    return 0;
}

file: f2.c

int foo = 0;
int *bar() {
    return &foo;
}

Then a problem asks me whether this statement is correct: No matter how the program is compiled, linked or run, the output of it must be a constant (with respect to multiple runs) and it is non-zero.

I think this is correct. Although there are two definitions of foo, one of them is declared with static, so it shadows the global foo, thus the linker will not pick only one foo. Since the relative position of variables should be fixed when run (although the absolute addresses can vary), the output must be a constant.

I experimented with the code and on gcc 7.5.0 with gcc f1.c f2.c -o test && ./test it would always output 1 (but if I remove the static, it would output 0). But the answer says that the statement above is wrong. I wonder why. Are there any mistakes in my understanding?

A result of objdump follows. Both foos go to .bss.

Context. This is a problem related to the linking chapter of Computer Systems: A Programmer's Perspective by Randal E. Bryant and David R. O'Hallaron. But it does not come from the book.

Update. OK now I've found out the reason. If we swap the order and compile as gcc f2.c f1.c -o test && ./test, it will output -1. Quite a boring problem...

CodePudding user response：

Indeed the static variable foo in the f1.c module is a different object from the global foo in the f2.c module referred to by the bar() function. Hence the output should be non zero.

Note however that subtracting 2 pointers that do not point to the same array or one past the end of the same array is meaningless, hence the difference might be 0 even for different objects. This may happen even as &foo == bar() would be non 0 because the objects are different. This behavior was common place in 16-bit segmented systems using the large model where subtracting pointers only affected the offset portion of the pointers whereas comparing them for equality compared both the segment and the offset parts. Modern systems have a more regular architecture where everything is in the same address space. Just be aware that not every system is a linux PC.

Furthermore, the printf conversion format %ld expects a value of type long whereas you pass a value of type ptrdiff_t which may be a different type (namely 64-bit long long on Windows 64-bit targets for example, which is different from 32-bit long there). Either use the correct format %td or cast the argument as (long)(bar() - &foo).

Finally, nothing in the C language guarantees that the difference between the addresses of global objects be constant across different runs of the same program. Many modern systems perform address space randomisation to lessen the risk of successful attacks, leading to different addresses for stack objects and/or static data in successive runs of the same executable.

CodePudding user response：

Abstracting from the wring printf formats and pointer arithmetic problems static global variable from one compilation unit will be different than static and non-static variables having that same name in other compilation units.

to correctly see the difference in chars you should cast both to char pointer and use %td format which will print ptrdiff_t type. If your platform does not support it, cast the result to long long int

int main() {
    int *bar();
    printf("%td\n", (char *)bar() - (char *)&foo);
    return 0;
}

printf("%lld\n", (long long)((char *)bar() - (char *)&foo));

If you want to store this difference in the variable use ptrdiff_t type:

ptrdiff_t diff = (char *)bar() - (char *)&foo;