In this simple program, I get a relocation in main for compute, but not for compute2:
static int compute2()
{
    return 2;
}
int compute()
{
    return 1;
}
int main()
{
    return compute() + compute2();
}
I compile this with gcc -c main.cpp using gcc 11.2.0 on Ubuntu 21.10.
Here's what objdump says about main:
000000000000001e <main>:
1e: f3 0f 1e fa endbr64
22: 55 push rbp
23: 48 89 e5 mov rbp,rsp
26: 53 push rbx
27: e8 00 00 00 00 call 2c <main+0xe>
28: R_X86_64_PLT32 compute()-0x4
2c: 89 c3 mov ebx,eax
2e: e8 cd ff ff ff call 0 <compute2()>
33: 01 d8 add eax,ebx
35: 48 8b 5d f8 mov rbx,QWORD PTR [rbp-0x8]
39: c9 leave
3a: c3 ret
As you can see, for the call to compute2 (internal linkage) there is a relative call with no relocation. But for the call to compute (external linkage) there is a relocation, even though all three functions are in the same section of the same object file.
Why is that relocation needed? I thought the linker would never split up a section, so no matter where this section gets loaded, relative addresses should still be the same? Why does linkage seemingly affect this?
CodePudding user response:
It's not that a relocation is needed per se; it's that the compiler chooses to indirect through the PLT (because of possible symbol interposition, or in case the main executable or an earlier shared lib defines the symbol). Note the relocation type R_X86_64_PLT32.
If you look at the compiler's asm output (not disassembly of the .o), you'd see call compute@plt.
A static function definitely always uses the definition in the same translation unit, but other definitions of global symbols can take precedence.
For symbols defined in the same .c (translation unit), this should only be happening with -fPIC, not when building code for the main executable itself (-fPIE is on by default in most modern distros).
https://godbolt.org/z/qYYWsYf6a shows GCC -fPIE still using call compute. Apparently Ubuntu enables some other options that make this different? (Godbolt's GCC doesn't enable by default several things that most distros do, so you need some options to match how GCC is configured on Ubuntu. -fstack-protector-strong isn't relevant, and I don't know what else would be.)
Note that when linking an executable (not a shared lib), the call should get "relaxed" to a direct call that doesn't go through the PLT. So it's ok for GCC to emit all calls as call foo@plt.
If you were using -fno-plt as well, calls would be emitted as call *foo@GOTPCREL(%rip), which takes 6 bytes, so relaxing it to a direct 5-byte call rel32 needs a byte of filler; ld uses a meaningless address-size prefix. (See my answer on Can't call C standard library function on 64-bit Linux from assembly (yasm) code for an example.)
If you don't want this PLT indirection in the first place, you can set ELF visibility = hidden for that symbol. This is a really good idea when making a shared library, since in that case the linker won't be able to relax all the indirection through the PLT for internal calls to functions you don't intend to allow symbol-interposition for.
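As a minimal sketch of the per-symbol approach, here are the question's functions with compute marked hidden via the GCC/Clang visibility attribute. It assumes an ELF target; sum() is a hypothetical stand-in for the question's main, and the attribute placement shown is just one accepted spelling.

```cpp
// Sketch: the question's functions, with compute() given hidden ELF visibility.
// Assumes GCC or Clang targeting ELF.

static int compute2()   // internal linkage: never a candidate for interposition
{
    return 2;
}

// Still has external linkage, so other translation units linked into the
// same executable or library can call it. Hidden visibility means another
// shared object can neither reference nor preempt this definition.
__attribute__((visibility("hidden"))) int compute()
{
    return 1;
}

int sum()   // hypothetical stand-in for the question's main()
{
    return compute() + compute2();
}
```

To see what changes on a given toolchain, compare the compiler's asm output (gcc -S) with and without the attribute, rather than relying on the disassembly of the .o alone.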
You can use -fvisibility=hidden to make that the default for all prototypes, so calls will use call rel32, not indirect through the PLT (or GOT with -fno-plt). Then for any function or variable a shared library does want to export, use __attribute__((visibility("default"))).
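A common way to apply this in a library is an export macro. The sketch below assumes the file is compiled with -fvisibility=hidden; MYLIB_EXPORT, helper and api_entry are hypothetical names used only for illustration.

```cpp
// Sketch for a shared library built with something like:
//   g++ -shared -fPIC -fvisibility=hidden mylib.cpp -o libmylib.so
// MYLIB_EXPORT, helper() and api_entry() are hypothetical names.
#define MYLIB_EXPORT __attribute__((visibility("default")))

// No attribute: takes the command-line default (hidden). Usable from other
// source files of the library, but not exported from the .so.
int helper(int x)
{
    return x * 2;
}

// Explicitly exported: stays in the library's dynamic symbol table.
MYLIB_EXPORT int api_entry(int x)
{
    return helper(x) + 1;
}
```

Running nm -D on the resulting library is one way to check which symbols ended up exported.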
For your case, -fvisibility=hidden may solve the problem you're having, with GCC unnecessarily indirecting even though you're not building code that can go into a shared library (with -fPIC).
See also
- https://gcc.gnu.org/wiki/Visibility
- How to use the __attribute__((visibility("default")))?
- https://unix.stackexchange.com/questions/472660/what-are-difference-between-the-elf-symbol-visibility-levels
- Can't call C standard library function on 64-bit Linux from assembly (yasm) code (link-time relaxation example when linking an executable rather than a shared lib.)
- Sorry state of dynamic libraries on Linux on Thiago Macieira's blog, from 2012. (Before -fno-plt existed; so at least that idea has been implemented, and is now the default for some distros' binary packages, like Arch GNU/Linux.)
CodePudding user response:
I believe this behavior is implemented to enable symbol interposition: by emitting the compute call with a relocation, you can run your code like
> LD_PRELOAD=custom_compute.so ./main
and your compute call will be relocated to a custom compute function defined in the .so.
This functionality is disabled for static functions like compute2, which have internal linkage and shouldn't be available for symbol interposition.
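For illustration, the replacement library in the LD_PRELOAD command above could be built from a file like the following. This is only a sketch: the file name custom_compute.cpp and the return value 42 are assumptions. Whether the preloaded definition is actually picked up depends on the call still being resolved by the dynamic linker at run time, which is typically the case when the caller lives in a shared library rather than in the executable that defines compute itself.

```cpp
// custom_compute.cpp (hypothetical file name, matching custom_compute.so).
// Possible build command:
//   g++ -shared -fPIC custom_compute.cpp -o custom_compute.so
//
// Both this file and the original are C++, so the mangled symbol name for
// compute() is the same and the dynamic linker treats the two definitions
// as the same symbol.
int compute()
{
    return 42;   // arbitrary value, chosen to differ from the original's 1
}
```

A static function such as compute2 has no dynamic symbol at all, so there is nothing a preloaded library could replace.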
As mentioned in comments, this behavior is not just for LD_PRELOAD but is more generally relevant for shared libraries. For instance, in this example, if two shared libraries were loaded, both defining compute, the second library's call to compute would be relocated to the first library's function.