How does a linker relocate branch instructions in MIPS?

Time:03-07

Background

I'm working on a 2015 CS61C (Berkeley) course project on writing a linker to link object files generated from the following subset of the MIPS instruction set.

Add Unsigned: addu $rd, $rs, $rt
Or: or $rd, $rs, $rt
Set Less Than: slt $rd, $rs, $rt
Set Less Than Unsigned: sltu $rd, $rs, $rt
Jump Register: jr $rs
Shift Left Logical: sll $rd, $rt, shamt
Add Immediate Unsigned: addiu $rt, $rs, immediate
Or Immediate: ori $rt, $rs, immediate
Load Upper Immediate: lui $rt, immediate
Load Byte: lb $rt, offset($rs)
Load Byte Unsigned: lbu $rt, offset($rs)
Load Word: lw $rt, offset($rs)
Store Byte: sb $rt, offset($rs)
Store Word: sw $rt, offset($rs)
Branch on Equal: beq $rs, $rt, label
Branch on Not Equal: bne $rs, $rt, label
Jump: j label
Jump and Link: jal label
Load Immediate: li $rt, immediate
Branch on Less Than: blt $rs, $rt, label

From this subset of instructions, I think the ones that need relocation are the j, bne, and beq instructions (blt is a pseudo-instruction). The latter two would need to be relocated only if the label is not present in the same file.

The comments on the MIPS function that relocates an instruction read:

#------------------------------------------------------------------------------
# function relocate_inst()
#------------------------------------------------------------------------------
# Given an instruction that needs relocation, relocates the instruction based
# on the given symbol and relocation table.
#
# You should return error if 1) the addr is not in the relocation table or
# 2) the symbol name is not in the symbol table. You may assume otherwise the 
# relocation will happen successfully.
#
# Arguments:
#  $a0 = an instruction that needs relocating
#  $a1 = the byte offset of the instruction in the current file
#  $a2 = the symbol table
#  $a3 = the relocation table
#
# Returns: the relocated instruction, or -1 if error

Note that the relocation table contains addresses relative to the start of the object file being linked, while the symbol table is an aggregate of the symbol tables of all the object files being linked and contains absolute addresses.

Problem

  • If the instruction to be relocated is a j instruction, then since $a1 contains the relative address of the instruction, we look up that address in the relocation table to find the label being referenced, and then find the absolute address of that label in the symbol table. We can then place (absolute address >> 2) in the low 26 bits of the instruction.

  • If the instruction to be relocated is bne or beq, however, I am not sure what to do. The low-order 16 bits are supposed to be an offset relative to PC + 4, but we don't know the absolute address of the instruction being relocated, so we don't know what PC + 4 is.
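
The j case in the first bullet can be sketched as follows. This is only a Python illustration, not the project's MIPS solution: the name `relocate_j` and the dictionary-based tables are hypothetical stand-ins for the structures passed in $a2 and $a3, and the addresses are made up.

```python
J_TARGET_MASK = 0x03FFFFFF  # low 26 bits of a J-type instruction

def relocate_j(inst, offset, symtab, reltab):
    """Return the j/jal instruction with its 26-bit target filled in, or -1 on error.

    reltab maps a byte offset within the current file to a label name;
    symtab maps a label name to an absolute address. Both are plain dicts
    here, which is an assumption made for illustration.
    """
    label = reltab.get(offset)
    if label is None:
        return -1  # the address is not in the relocation table
    addr = symtab.get(label)
    if addr is None:
        return -1  # the symbol name is not in the symbol table
    # Keep the 6-bit opcode and replace the target field with the word address.
    return (inst & ~J_TARGET_MASK & 0xFFFFFFFF) | ((addr >> 2) & J_TARGET_MASK)

symtab = {"foo": 0x00400020}   # hypothetical absolute address of label foo
reltab = {8: "foo"}            # the instruction at byte offset 8 refers to foo
print(hex(relocate_j(0x08000000, 8, symtab, reltab)))  # 0x8100008
```

Note that nothing here depends on the absolute address of the instruction itself, which is why the j case works with only the information the function is given.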

Looking at various solutions online, it seems that only j relocations are handled.

Am I missing something?

EDIT: We are only considering the text segment.

Answer:

My guess is that this linker does not handle branch instructions (bne or beq) to external labels.

This will preclude using beq label where label is external (global and in another object file), but this is only really possible to do in assembly.

Compiler output, for example, will have both the branch instruction and its target location within a single function, which goes into a single code chunk (modulo certain tail call optimizations).

With that limitation, all bne and beq instructions are already fixed up by the compiler or assembler using PC-relative addressing, so there would be no need for an entry in the relocation table for them.
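
To see why no relocation entry is needed, here is a small Python sketch of how a beq/bne offset field is computed. The helper `branch_offset` and the addresses are hypothetical; the point is that the encoded value depends only on the distance between the branch and its target, not on where the file ends up in memory.

```python
def branch_offset(branch_addr, target_addr):
    """16-bit immediate for beq/bne: word distance from PC + 4 to the target."""
    return ((target_addr - (branch_addr + 4)) >> 2) & 0xFFFF

# The same branch/target pair with the file placed at two different base addresses.
for base in (0x00000000, 0x00400000):
    print(hex(branch_offset(base + 0x10, base + 0x24)))  # 0x4 both times
```

Because both addresses shift by the same amount when the linker places the file, the offset the assembler wrote stays valid.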

Further, the range of the branch (beq/bne) instructions (±128 KB) is shorter than that of j, so if the linker really intended to support branching to external labels, it might also have to provide the capability to introduce branch islands to handle branches whose targets are too far away.
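
The ±128 KB figure follows from the 16-bit signed word offset. As a rough Python sketch (the helper `branch_in_range` is hypothetical and assumes word-aligned addresses), a linker that supported external branch targets would need a check along these lines before deciding whether a branch island is required:

```python
def branch_in_range(branch_addr, target_addr):
    """True if target_addr is reachable by a single beq/bne at branch_addr."""
    words = (target_addr - (branch_addr + 4)) // 4  # offset is counted from PC + 4
    return -(1 << 15) <= words <= (1 << 15) - 1     # signed 16-bit field

pc = 0x00400000
print(branch_in_range(pc, pc + 4 + 0x1FFFC))  # True: farthest forward target
print(branch_in_range(pc, pc + 4 + 0x20000))  # False: one word too far
```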


To expand on your example:

if ( a1 == a0 )
    printf ("hello");

would be

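    bne $a1, $a0, endIf1    # skip the call when a1 != a0
    la $a0, Lhello          # address of the "hello" string
    jal printf              # external call, resolved by the linker
endIf1: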

Some compilers don't know which function is in which DLL, so even if printf were in a DLL, the compiler output could still look the same.
