RISC-V assembler is replacing beq instructions with bne + jal

I have this RISC-V assembly program:

addi        x2,x0,5
addi        x3,x0,12
addi        x7,x3,-9
or          x4,x7,x2
and         x5,x3,x4
add         x5,x5,x4
beq         x5,x7,0x48

I want to get the assembled instructions in hex format so I can load them onto an FPGA. I get those values by running the following script:

#!/bin/bash

# Given a source assembly file, assemble it and display
# the hex value of each instruction

SRC_FILE=$1

RV_AS=riscv64-unknown-elf-as
RV_OBJCOPY=riscv64-unknown-elf-objcopy

$RV_AS -o /tmp/gen_asm_instr.elf $SRC_FILE -march=rv32ima
$RV_OBJCOPY -O binary /tmp/gen_asm_instr.elf /tmp/gen_asm_instr.bin
xxd -e -c 4 /tmp/gen_asm_instr.bin | cut -d ' ' -f 2

If I comment out the last assembly instruction (beq), everything works. I get the following result:

00500113
00c00193
ff718393
0023e233
0041f2b3
004282b3

Those are 6 instructions, everything is fine. However, if I uncomment the last instruction, I get:

00500113
00c00193
ff718393
0023e233
0041f2b3
004282b3
00729463
0000006f

Those are 8 instructions. If I disassemble the above, I get:

Dis-assemble

# Create a file 'template.txt' with the above instructions:
00000000: 00500113 00c00193 ff718393 0023e233
00000010: 0041f2b3 004282b3 00729463 0000006f

# Use xxd and objdump to recover the assembly instructions
xxd -r template.txt > a.out # generate a binary file
xxd -e a.out > a-big.txt    # switch endianness
xxd -r ./a-big.txt > a2.out # generate a bin. file with the switched endianness
riscv64-unknown-elf-objdump -M no-aliases -M numeric -mabi=ilp32 -b binary -m riscv -D ./a2.out    # disassemble it

Result:

./a2.out:     file format binary


Disassembly of section .data:

0000000000000000 <.data>:
   0:   00500113            addi    x2,x0,5
   4:   00c00193            addi    x3,x0,12
   8:   ff718393            addi    x7,x3,-9
   c:   0023e233            or  x4,x7,x2
  10:   0041f2b3            and x5,x3,x4
  14:   004282b3            add x5,x5,x4
  18:   00729463            bne x5,x7,0x20
  1c:   0000006f            jal x0,0x1c

So the RISC-V assembler is transforming the beq instruction into two instructions: bne and jal.

Why does this happen? How can I avoid it?

EDIT

I have tried with this online assembler:

https://riscvasm.lucasteske.dev/

and the same happens.

Answer:

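The assembler seems to do this whenever the branch target is a hard-coded numeric address. I can't say for certain why it chooses to do that, but note that jal reaches much farther (20-bit immediate, roughly ±1 MiB) than beq (12-bit immediate, roughly ±4 KiB). Both bxx and jal are PC-relative, so neither supports absolute addressing. The assembler might not know where the code will finally be located, so it cannot tell whether the absolute address 0x48 is within reach of beq. Inverting the condition (bne) to skip over a jal gives it the additional range to reach that address. Your disassembly shows exactly this pattern: the bne at 0x18 branches to 0x20, just past the jal at 0x1c.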
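To make the PC-relative point concrete, here is a rough bash sketch that decodes the offsets stored in the two extra words from the question (00729463 and 0000006f). The function names decode_b_offset and decode_j_offset are made up for this illustration; the bit layouts are the standard RISC-V B-type and J-type immediate encodings.

```shell
#!/bin/bash
# Sketch: decode the PC-relative offset of a RISC-V branch (B-type) or
# jal (J-type) instruction word given as 8 hex digits.
# decode_b_offset / decode_j_offset are hypothetical helper names.

# B-type: imm[12] = bit 31, imm[11] = bit 7, imm[10:5] = bits 30:25, imm[4:1] = bits 11:8
decode_b_offset() {
    local insn=$(( 16#$1 ))
    local imm=$(( ((insn >> 31) & 0x1) << 12 ))
    imm=$(( imm | (((insn >> 7)  & 0x1)  << 11) ))
    imm=$(( imm | (((insn >> 25) & 0x3f) << 5) ))
    imm=$(( imm | (((insn >> 8)  & 0xf)  << 1) ))
    # sign-extend the 13-bit offset
    if (( imm & 0x1000 )); then imm=$(( imm - 0x2000 )); fi
    echo "$imm"
}

# J-type: imm[20] = bit 31, imm[19:12] = bits 19:12, imm[11] = bit 20, imm[10:1] = bits 30:21
decode_j_offset() {
    local insn=$(( 16#$1 ))
    local imm=$(( ((insn >> 31) & 0x1) << 20 ))
    imm=$(( imm | (((insn >> 12) & 0xff)  << 12) ))
    imm=$(( imm | (((insn >> 20) & 0x1)   << 11) ))
    imm=$(( imm | (((insn >> 21) & 0x3ff) << 1) ))
    # sign-extend the 21-bit offset
    if (( imm & 0x100000 )); then imm=$(( imm - 0x200000 )); fi
    echo "$imm"
}

echo "bne offset: $(decode_b_offset 00729463)"   # prints "bne offset: 8"
echo "jal offset: $(decode_j_offset 0000006f)"   # prints "jal offset: 0"
```

The bne at 0x18 carries an offset of 8, so it lands on 0x20 and skips the jal. The jal carries an offset of 0, which is why objdump shows it jumping to itself (0x1c). My reading, which I have not verified against the assembler source, is that the real target is left to a relocation that objcopy -O binary never applies on an unlinked object file.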

If you use a label as the branch target instead, the assembler won't do that as long as the label is within the branch's reach.
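As a minimal sketch of that suggestion, the script below writes a variant of the program that branches to a label and then runs the same pipeline as in the question. The label name target, the two addi x6 instructions around it, and the /tmp/beq_label.* paths are placeholders invented for illustration, since the question doesn't show what lives at the branch target. The toolchain step is skipped if riscv64-unknown-elf-as is not installed.

```shell
#!/bin/bash
# Sketch: same program as in the question, but the branch targets a label
# instead of the literal address 0x48.
SRC_FILE=/tmp/beq_label.s

cat > "$SRC_FILE" <<'EOF'
addi        x2,x0,5
addi        x3,x0,12
addi        x7,x3,-9
or          x4,x7,x2
and         x5,x3,x4
add         x5,x5,x4
beq         x5,x7,target    # label instead of a numeric address
addi        x6,x0,1         # hypothetical filler, skipped when the branch is taken
target:
addi        x6,x0,2         # hypothetical instruction at the branch target
EOF

RV_AS=riscv64-unknown-elf-as
RV_OBJCOPY=riscv64-unknown-elf-objcopy

if command -v "$RV_AS" >/dev/null 2>&1; then
    # Same steps as the script in the question
    "$RV_AS" -march=rv32ima -o /tmp/beq_label.elf "$SRC_FILE"
    "$RV_OBJCOPY" -O binary /tmp/beq_label.elf /tmp/beq_label.bin
    xxd -e -c 4 /tmp/beq_label.bin | cut -d ' ' -f 2
else
    echo "$RV_AS not found; wrote $SRC_FILE only"
fi
```

If the answer above is right, the output should contain one word per source instruction, with a single beq in place of the bne/jal pair. It is worth disassembling the result the same way as in the question to confirm.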