Can't exactly understand the assembly code about how relocation works in u-boot

Time:11-27

In the U-Boot linker script I see these lines (https://source.denx.de/u-boot/u-boot/-/blob/v2021.10/arch/arm/cpu/armv8/u-boot.lds near line 134).

.rel_dyn_start :
{
    *(.__rel_dyn_start)
}

.rela.dyn : {
    *(.rela*)
}

.rel_dyn_end :
{
    *(.__rel_dyn_end)
}

And this is the code that performs the relocation (https://source.denx.de/u-boot/u-boot/-/blob/v2021.10/arch/arm/cpu/armv8/start.S near line 88).
I can't fully understand the following lines.

    adrp    x2, __rel_dyn_start     /* x2 <- Runtime &__rel_dyn_start */
    add     x2, x2, #:lo12:__rel_dyn_start
    adrp    x3, __rel_dyn_end       /* x3 <- Runtime &__rel_dyn_end */
    add     x3, x3, #:lo12:__rel_dyn_end
pie_fix_loop:
    ldp x0, x1, [x2], #16   /* (x0, x1) <- (Link location, fixup) */
    ldr x4, [x2], #8        /* x4 <- addend */
    cmp w1, #1027       /* relative fixup? */                         // <==== from here??
    bne pie_skip_reloc
    /* relative fix: store addend plus offset at dest location */
    add x0, x0, x9
    add x4, x4, x9
    str x4, [x0]
pie_skip_reloc:
    cmp x2, x3
    b.lo    pie_fix_loop

My first question is about the adrp x2, __rel_dyn_start instruction.
From the linker script I can see that .__rel_dyn_start is the name of a section, but I couldn't find a symbol named __rel_dyn_start (without the dot). Where does it come from?

My second question is about the code starting at cmp w1, #1027 (marked <==== from here?? above).
I guess that at this point x0, x1 and x4 contain the first three 8-byte values found at __rel_dyn_start (assuming __rel_dyn_start is the first symbol in the combined .rel_dyn_start section), and that x9 contains the relocation offset.
So why does cmp w1, #1027 compare w1 (the lower half of the second 8-byte value in the relocation section) with #1027? And if they are equal, why does the code add the relocation offset to both x0 and x4 and then store x4 at the new address x0? (Maybe x0 and x4 contain some kind of addresses.) I can't follow the rest of the code without understanding this.

Thank you for reading. I would be very grateful if someone could explain the main logic behind this and what this code is doing.

CodePudding user response:

1. I think you are looking for "u-boot-spl.lds" (https://source.denx.de/u-boot/u-boot/-/blob/v2021.10/arch/arm/cpu/u-boot-spl.lds).
At least that file contains a symbol with that name.

    .rel.dyn : {
        __rel_dyn_start = .;  /* <<== assigned the starting address of the section */
        *(.rel*)
        __rel_dyn_end = .;
    }

2. I don't know; I can only speculate.
The whole loop looks like it copies data from the original location to a 'relocated' area, where other code would look for it. Since the new location is different, the addresses are adjusted accordingly.
`#1027` might have been chosen arbitrarily, or it might be the maximum relative offset that can be encoded in an instruction and was chosen because of that restriction.
Take it with a grain of salt, though.

CodePudding user response:

I think I now understand what the code is doing (for the missing __rel_dyn_start and __rel_dyn_end symbols, please see user3124812's answer and my comment on it).
In https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/elf.h#L182 I found this Elf64_Rela structure.

typedef struct elf64_rela {
  Elf64_Addr r_offset;  /* Location at which to apply the action */
  Elf64_Xword r_info;   /* index and type of relocation */
  Elf64_Sxword r_addend;    /* Constant addend used to compute value */
} Elf64_Rela;

r_offset is the location that needs relocating (the value stored there must be changed to point to the actual location in memory), r_info holds the symbol index and the relocation type (the documentation lists many architecture-specific types), and r_addend is a constant used to compute the value stored at r_offset. The linker only knows link-time addresses, so after the program has been loaded into memory the loader (here, the startup code itself) has to adjust these values, because the load address is not always the same as the link-time address.
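To make the r_info field concrete, here is a minimal sketch of how it splits into a symbol index (upper 32 bits) and a relocation type (lower 32 bits), mirroring the ELF64_R_SYM and ELF64_R_TYPE macros from <elf.h>. The helper names rela_sym and rela_type are hypothetical and chosen only for this illustration; the value 1027 (0x403) is R_AARCH64_RELATIVE in the AArch64 ELF ABI.

```c
#include <stdint.h>

/* Relocation type tested by "cmp w1, #1027" in start.S (0x403). */
#define R_AARCH64_RELATIVE 1027u

/* Hypothetical helpers, equivalent to ELF64_R_SYM / ELF64_R_TYPE. */

/* Upper 32 bits of r_info: index into the symbol table. */
static inline uint32_t rela_sym(uint64_t r_info)
{
    return (uint32_t)(r_info >> 32);
}

/* Lower 32 bits of r_info: relocation type. This is what w1 holds
 * after "ldp x0, x1, [x2], #16", since w1 is the low half of x1. */
static inline uint32_t rela_type(uint64_t r_info)
{
    return (uint32_t)(r_info & 0xffffffffu);
}
```

A relative relocation needs no symbol, so its r_info is typically just 0x403 with a zero symbol index.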
This Elf64_Rela structure seems to be repeated between __rel_dyn_start and __rel_dyn_end for every location requiring relocation. The code reads one entry into the x0, x1 and x4 registers (r_offset, r_info and r_addend respectively). If the relocation type in r_info (w1, which is the lower 32 bits of x1) is 1027, which is R_AARCH64_RELATIVE (a relative fixup, as the comment says), then the location at r_offset + x9 is overwritten with r_addend + x9, where r_addend is the link-time address determined when the program was built and x9 is the offset between the link-time address and the actual load address. The code repeats this for every Elf64_Rela entry in the section.
So the program is not actually being copied to a new location; only the references are overwritten so that link-time addresses become the actual run-time addresses.
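The loop described above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not U-Boot's code: the struct rela64_t and the function apply_relative_relocs are hypothetical names, and the image is modelled as a byte buffer so the sketch can run on a host. In start.S the store goes directly to the address r_offset + x9; here that address is expressed as an offset into the buffer. Note also that the assembly loop checks its end condition after the body, whereas this sketch checks it first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define R_AARCH64_RELATIVE 1027u

/* Same layout as Elf64_Rela: three 8-byte fields, 24 bytes per entry. */
typedef struct {
    uint64_t r_offset;  /* link-time address of the slot to patch (x0) */
    uint64_t r_info;    /* symbol index and relocation type (x1)       */
    int64_t  r_addend;  /* link-time value to store in the slot (x4)   */
} rela64_t;

/*
 * Hypothetical helper: patch every R_AARCH64_RELATIVE entry in [start, end).
 * image     - bytes of the image as loaded (stands in for run-time memory)
 * link_base - address the image was linked at
 * load_base - address the image is actually running at
 * Returns the number of slots patched.
 */
size_t apply_relative_relocs(uint8_t *image, uint64_t link_base,
                             uint64_t load_base,
                             const rela64_t *start, const rela64_t *end)
{
    uint64_t delta = load_base - link_base;      /* plays the role of x9 */
    size_t patched = 0;

    for (const rela64_t *r = start; r < end; r++) {
        /* cmp w1, #1027 ; bne pie_skip_reloc */
        if ((uint32_t)r->r_info != R_AARCH64_RELATIVE)
            continue;

        assert(r->r_offset >= link_base);        /* slot must lie in the image */

        /* add x4, x4, x9 : link-time value becomes run-time value */
        uint64_t value = (uint64_t)r->r_addend + delta;

        /* add x0, x0, x9 ; str x4, [x0] : write into the relocated slot */
        memcpy(image + (r->r_offset - link_base), &value, sizeof value);
        patched++;
    }
    return patched;
}
```

For example, with an image linked at 0x1000 but running at 0x80001000, an entry with r_offset 0x1008 and r_addend 0x1010 causes the 8-byte slot at image offset 8 to receive 0x80001010.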
