Why MOV instruction is replaced by ADD instruction

I have the instruction mov r1, r7 in my assembly code, but the disassembly shows that the code actually generated was adds r1, r7, #0

I checked with ARMv6-M Architecture Reference Manual and I found out that there's MOVS <Rd>,<Rm> instruction (A6.7.40) which is different from ADDS.

While that's not a big issue, I'm still puzzled why the assembler replaces the code I wrote with different opcodes. According to the book that I'm reading, all non-jump instructions take 1 cycle (and I'd prefer the assembler to be dumb rather than try to optimize something for me).

I'm using Raspberry Pi Pico SDK which uses GNU Assembler, AFAIK.

All my code is written in helloworld.S, full source code is:

.thumb_func
.global main

main:
    mov r7, #0
    bl stdio_init_all
loop:
    ldr r0, =helloworld
    add r7, #1
    mov r1, r7
    bl printf
    mov r0, #250
    bl sleep_ms
    b loop

.data
.align 4
helloworld: .asciz "Hello World %d\n"

CodePudding user response：

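You are writing Thumb code in the pre-UAL ("divided") syntax, which is what GNU as assumes unless told otherwise, while the disassembler prints instructions in UAL (Unified Assembler Language) syntax. The same opcode is therefore spelled one way in your source and another way in the disassembly. (The RPi Pico's Cortex-M0+ implements ARMv6-M, that is, the 16-bit Thumb instructions plus a handful of 32-bit ones such as BL, and the ARMv6-M manual documents everything in UAL.)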

If you refer to the ARMv6-M reference manual, table D4-1 specifies the conversion of pre-UAL Thumb syntax into UAL syntax, notably:

Pre-UAL Thumb syntax	Equivalent UAL Syntax	Notes
`MOV <Rd>, <Rm>`	`ADDS <Rd>, <Rm>, #0`	If `<Rd>` and `<Rm>` are both R0-R7.
	`MOV <Rd>, <Rm>`	Otherwise.

As Tom suggests in another answer, you can add .syntax unified if you want to write UAL code, which is then assembled as written.
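
To make the table concrete, here is a small sketch of how the same spelling could assemble under each syntax. It assumes GNU as and a Cortex-M0+ target (the .cpu line is my assumption; the Pico SDK build normally selects the CPU on the command line). The encodings in the comments were worked out by hand from the ARMv6-M encoding diagrams, so it is worth confirming them with objdump on your own toolchain.

```asm
@ Assumption: GNU as, Cortex-M0+ (ARMv6-M). Encodings are hand-derived.
.cpu cortex-m0plus
.thumb

.syntax divided             @ pre-UAL syntax, the GNU as default
    mov  r1, r7             @ 0x1c39, disassembled as: adds r1, r7, #0

.syntax unified             @ UAL
    mov  r1, r7             @ 0x4639, MOV that leaves the flags alone
    movs r1, r7             @ 0x0039, flag-setting MOVS (same bits as lsls r1, r7, #0)
    adds r1, r7, #0         @ 0x1c39, same opcode as the pre-UAL mov above
```

Note that the three UAL forms are not interchangeable where flags matter: mov leaves them untouched, movs updates N and Z, and adds updates N, Z, C and V.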

CodePudding user response：

Can I suggest that you add:

.syntax unified

at the start of your file? After that you will have to explicitly add suffixes to all your instructions to mark whether they set the flags and whether they are conditional, but you may find it easier to have explicit control of the opcodes used.

Here is your code in the compiler explorer: https://godbolt.org/z/zo8nnc9sh
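
As a rough sketch of what that change might look like for the program in the question (not tested on hardware; the .cpu and .thumb lines are my assumptions, since the Pico SDK build usually passes the CPU on the command line):

```asm
@ Sketch: the question's program rewritten for unified syntax (UAL).
.syntax unified
.cpu cortex-m0plus          @ assumption: RP2040 core
.thumb

.text
.global main
.thumb_func
main:
    movs r7, #0             @ the 16-bit MOV immediate always sets flags, so the S is required
    bl   stdio_init_all
loop:
    ldr  r0, =helloworld
    adds r7, #1             @ likewise for the 16-bit ADD immediate
    mov  r1, r7             @ assembled as a plain MOV, flags untouched
    bl   printf
    movs r0, #250
    bl   sleep_ms
    b    loop

.data
.align 4
helloworld: .asciz "Hello World %d\n"
```

With this in place the disassembly should show mov r1, r7 as written, and the assembler rejects the suffix-less mov r7, #0 and add r7, #1 instead of silently picking the flag-setting forms.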

CodePudding user response：

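Thumb instructions are 16 bits long, which isn't a lot of bits. The original Thumb instruction set had no dedicated encoding for a move between two low registers, so pre-UAL mov rx, ry was defined as an alias for the more general adds rx, ry, #0, which does the same thing, and the assembler still converts it to adds when you use the pre-UAL syntax. So there, mov essentially becomes a pseudo-instruction. (ARMv6-M does have separate MOV and MOVS encodings, but you only get them in unified syntax.)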

Other ISAs do similar things. RISC-V doesn't have a mv instruction in the RV32I ISA; mv rd, rs is a pseudo-instruction that the assembler encodes as addi rd, rs, 0. The point is to save opcode encoding space, a limited resource.