I'm learning some basics about linking and encountered the following code.
file: f1.c
#include <stdio.h>
static int foo;
int main() {
    int *bar();
    printf("%ld\n", bar() - &foo);
    return 0;
}
file: f2.c
int foo = 0;
int *bar() {
    return &foo;
}
Then a problem asks me whether this statement is correct: No matter how the program is compiled, linked or run, the output of it must be a constant (with respect to multiple runs) and it is non-zero.
I think this is correct. Although there are two definitions of foo, one of them is declared static, so it shadows the global foo and the linker will not merge the two into a single foo. Since the relative position of variables should be fixed at run time (although the absolute addresses can vary), the output must be a constant.
I experimented with the code on gcc 7.5.0: with gcc f1.c f2.c -o test && ./test it always outputs 1 (but if I remove the static, it outputs 0). But the answer says that the statement above is wrong. I wonder why. Are there any mistakes in my understanding?
According to objdump, both foos go to .bss.
Context. This is a problem related to the linking chapter of Computer Systems: A Programmer's Perspective by Randal E. Bryant and David R. O'Hallaron. But it does not come from the book.
Update. OK, now I've found out the reason. If we swap the order and compile with gcc f2.c f1.c -o test && ./test, it outputs -1. Quite a boring problem...
CodePudding user response:
Indeed, the static variable foo in the f1.c module is a different object from the global foo in the f2.c module referred to by the bar() function. Hence the output should be non-zero.
Note, however, that subtracting two pointers that do not point into the same array, or one past its end, has undefined behavior, so the difference might be 0 even for different objects. This may happen even though &foo == bar() would evaluate to 0 because the objects are different. This behavior was commonplace on 16-bit segmented systems using the large model, where subtracting pointers only affected the offset portion of the pointers whereas comparing them for equality compared both the segment and the offset parts. Modern systems have a more regular architecture where everything is in the same address space. Just be aware that not every system is a Linux PC.
Furthermore, the printf conversion format %ld expects a value of type long, whereas you pass a value of type ptrdiff_t, which may be a different type (for example, 64-bit long long on 64-bit Windows targets, where long is 32 bits). Either use the correct format %td or cast the argument as (long)(bar() - &foo).
Finally, nothing in the C language guarantees that the difference between the addresses of global objects be constant across different runs of the same program. Many modern systems perform address space randomisation to lessen the risk of successful attacks, leading to different addresses for stack objects and/or static data in successive runs of the same executable.
CodePudding user response:
Setting aside the wrong printf format and the pointer arithmetic problems, a static global variable in one compilation unit is a different object from any static or non-static variable with the same name in other compilation units.
To see the difference in bytes (chars), cast both pointers to char * and use the %td format, which prints a value of type ptrdiff_t. If your platform does not support %td, cast the result to long long int:
int main() {
    int *bar();
    printf("%td\n", (char *)bar() - (char *)&foo);
    return 0;
}
or
printf("%lld\n", (long long)((char *)bar() - (char *)&foo));
If you want to store this difference in a variable, use the ptrdiff_t type:
ptrdiff_t diff = (char *)bar() - (char *)&foo;