Why does linkage affect whether relocations are needed for relative jumps in the same section?

Time:09-18

In this simple program, I get a relocation in main for compute, but not for compute2:

static int compute2()
{
    return 2;
}

int compute()
{
    return 1;
}


int main()
{
    return compute() + compute2();
}

I compile this with gcc -c main.cpp using gcc 11.2.0 on Ubuntu 21.10.

Here's what objdump says about main:

000000000000001e <main>:
  1e:   f3 0f 1e fa             endbr64 
  22:   55                      push   rbp
  23:   48 89 e5                mov    rbp,rsp
  26:   53                      push   rbx
  27:   e8 00 00 00 00          call   2c <main+0xe>
                        28: R_X86_64_PLT32  compute()-0x4
  2c:   89 c3                   mov    ebx,eax
  2e:   e8 cd ff ff ff          call   0 <compute2()>
  33:   01 d8                   add    eax,ebx
  35:   48 8b 5d f8             mov    rbx,QWORD PTR [rbp-0x8]
  39:   c9                      leave  
  3a:   c3                      ret    

As you can see, for the call to compute2 (internal linkage) there is a relative jump with no relocation. But for the call to compute (external linkage) there is a relocation, even if all three functions are in the same section in the same object file.

Why is that relocation needed? I thought the linker would never split up a section, so no matter where this section gets loaded, relative addresses should still be the same? Why does linkage seemingly affect this?

CodePudding user response:

It's not that a relocation is needed per se; the compiler chooses to indirect through the PLT (because of possible symbol interposition, or in case the main executable or an earlier shared lib defines the symbol). Note the relocation type R_X86_64_PLT32.

If you look at the compiler's asm output (not disassembly of the .o), you'd see call compute@plt.

A static function definitely always uses the definition in the same translation unit, but other definitions of global symbols can take precedence.


This should only happen with -fPIC, not when building the main executable itself (-fPIE is on by default in most modern distros), at least for symbols defined in the same translation unit (.c file).

https://godbolt.org/z/qYYWsYf6a shows GCC -fPIE still using call compute. Apparently Ubuntu enables some other options that make this different? (Godbolt's gcc doesn't enable-by-default several things that most distros do, so you need some options to match how GCC is configured on Ubuntu. -fstack-protector-strong isn't relevant, and IDK what else would be.)

Note that when linking an executable (not a shared lib), the call should get "relaxed" to a direct call that doesn't go through the PLT. So it's ok for GCC to emit all calls as call foo@plt.

If you were using -fno-plt as well, calls would be emitted as call *foo@gotplt(%rip), which takes 6 bytes, so relaxing it to a direct 5-byte call rel32 needs a byte of filler; ld uses a meaningless address-size prefix. (See my answer on Can't call C standard library function on 64-bit Linux from assembly (yasm) code for an example.)


If you don't want this PLT indirection in the first place, you can set ELF visibility = hidden for that symbol. This is a really good idea when making a shared library, since in that case the linker won't be able to relax all the indirection through the PLT for internal calls to functions you don't intend to allow symbol-interposition for.

You can use -fvisibility=hidden to make that the default for all prototypes, so calls will use call rel32, not indirect through the PLT (or GOT with -fno-plt). Then for any function or variable a shared library does want to export, use __attribute__((visibility("default")))

For your case, -fvisibility=hidden may solve the problem you're having, with GCC unnecessarily indirecting even though you're not building code that can go into a shared library (with -fPIC).
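As a minimal sketch of the visibility approach, assuming GCC or Clang targeting ELF: the EXPORT macro and the api_entry function are hypothetical names introduced here for illustration, not part of the question's code. A call to the hidden compute cannot be interposed, so the linker can bind it directly instead of routing it through the PLT.

```cpp
// Sketch only: EXPORT and api_entry are hypothetical names.
// Assumes GCC or Clang on an ELF target (e.g. x86-64 Linux).
#define EXPORT __attribute__((visibility("default")))

// Hidden visibility: the symbol is not exported from a shared library,
// so callers inside the library always reach this definition.
__attribute__((visibility("hidden"))) int compute()
{
    return 1;
}

// Default visibility: exported, and therefore still interposable.
EXPORT int api_entry()
{
    return compute() + 1;
}
```

Building with -fvisibility=hidden would make the explicit hidden attribute on compute redundant, leaving only the EXPORT annotations on the functions you mean to export.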


CodePudding user response:

I believe this behavior is implemented to enable symbol interposition. Because the call to compute is resolved through a relocation rather than hard-coded, you can run your code like

> LD_PRELOAD=custom_compute.so ./main

and your compute call will be relocated to a custom compute function defined in the .so.
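A hypothetical custom_compute.cpp for such a library might look like the sketch below. The file name, build command, and return value are assumptions for illustration. It also assumes the original compute is defined in a shared library rather than in the executable itself, since a definition in the executable comes first in the dynamic symbol lookup order and calls to it are normally bound at link time.

```cpp
// custom_compute.cpp: hypothetical interposing library.
// Assumed build command:
//   g++ -shared -fPIC custom_compute.cpp -o custom_compute.so
//
// The name and signature must match the original function exactly,
// so that both definitions get the same mangled C++ symbol name.
int compute()
{
    return 42;  // arbitrary value, chosen to differ from the original's 1
}
```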


This functionality is disabled for static functions like compute2, which have internal linkage and shouldn't be available for symbol interposition.


As mentioned in comments, this behavior is not just for LD_PRELOAD but is relevant to shared libraries in general. For instance, in this example, if two shared libraries that both define compute were loaded, the second library's call to compute would be resolved to the first library's function.
