Dynamic assembly in THUMB or ARM mode

In GNU AS, I understand it's possible to use Unified Syntax and sometimes get ARM code to automagically assemble as Thumb code. In some cases this produces impressive gains in code density, but in many cases it just doesn't work because some ARM instruction is impossible to encode in Thumb mode.

What I'd like is some way for GNU AS to "fall back" to ARM when attempting to assemble a function block in Thumb mode. So, if there's an instruction that doesn't work in Thumb, the function is assembled in ARM mode, but if it works, then I get the code shrinkage, all without having to annotate each of some 50,000 stub functions.

I've tried Googling and haven't found ANYTHING, so any help would be appreciated.

EDIT:

Thanks to the input I was able to get a makefile that tries to build Thumb first and falls back successfully to ARM. It's slow right now, but it works. Very pleased, thanks for all the input.
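For anyone wanting to try the same thing, here is a minimal Python sketch of that kind of try-Thumb-then-ARM build step. It is not the actual makefile. It assumes each stub lives in its own source file that uses `.syntax unified` and contains no mode directives of its own, and that the toolchain is called `arm-none-eabi-as` (adjust for yours). `-mthumb` is the GAS option that starts assembly in Thumb mode; leaving it off gives ARM. The `run` parameter is only there so the logic can be exercised without a toolchain installed.

```python
import subprocess
from typing import Callable, Sequence


def run_assembler(cmd: Sequence[str]) -> bool:
    """Run one assembler command; True means it exited with status 0."""
    return subprocess.run(cmd, capture_output=True).returncode == 0


def assemble_with_fallback(
    src: str,
    obj: str,
    run: Callable[[Sequence[str]], bool] = run_assembler,
    tool: str = "arm-none-eabi-as",  # assumed toolchain name
) -> str:
    """Try Thumb first, then ARM. Returns the mode that assembled."""
    attempts = (
        ("thumb", ["-mthumb"]),  # start in Thumb mode
        ("arm", []),             # GAS default for ARM targets
    )
    for mode, flags in attempts:
        cmd = [tool, *flags, "-o", obj, src]
        if run(cmd):
            return mode
    # Neither mode worked, e.g. the file mixes Thumb-only and ARM-only code.
    raise RuntimeError(f"{src} assembles in neither Thumb nor ARM mode")
```

Calling `assemble_with_fallback("stub_0001.s", "stub_0001.o")` for each file (the file names are made up) gives per-file granularity, which is also why it is slow: every stub that needs ARM costs two assembler invocations.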

CodePudding user response:

It would be plausible for an assembler to have this feature, but GAS doesn't. GAS is designed as a one-pass assembler so it doesn't like to back-track. (For x86 it does do branch-displacement optimization, so it can make multiple passes over its internal data structure representing the code before emitting it. That might or might not be sufficient to add such a feature.)

There are some Thumb-only instructions like tbb. If a function can't be assembled in either Thumb or ARM mode, the assembler would have to report an error. But the feature would still be useful and desirable for some use-cases, and old code that was originally written as ARM-only wouldn't run into this problem.

Part of the problem would be knowing where functions end. The MASM-style proc / endp model would make that possible, but ARM assembly (GAS or Keil/ARMASM) doesn't do that. Instead, there are just labels at the top of functions.

You could introduce a new directive like .auto_func (vs .thumb_func and .arm_func), and treat any of those three as boundaries between functions for this hypothetical feature.

You'd also want something to warn you when an innocent-seeming instruction caused a whole function to fall back to ARM, like an add r0, #123. (Instead of adds.)

That add is encodeable with a 32-bit Thumb-2 encoding, for CPUs that support Thumb-2 (so not, for example, Cortex-M0). M0 doesn't support ARM mode at all, but it's an easy CPU to remember as (mostly) not supporting Thumb-2, which makes it handy for testing how things assemble.

I originally misunderstood the question. The part of my answer below is pointing out that it's not viable (or a good idea) to have a mix inside a single function.

Switching between ARM and Thumb on a per-instruction basis inside functions would be impossible, or at best grossly inefficient if you did use instructions that aren't encodeable as 16-bit (or 32-bit Thumb-2 which significantly expands what you can do in Thumb mode).

Switching the CPU between decoding in Thumb and ARM modes requires a "thumb interworking" branch instruction like bx <reg> or blx <relative address>, so every ARM-only instruction would require two extra branch instructions: one to enter ARM mode and one to get back. The exception is when multiple ARM instructions are back to back, since they can share one pair of switches. A less naive assembler could also stay in ARM mode when there are only 1 or 2 Thumb instructions between ARM instructions, rather than bothering to switch.
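To put a number on that overhead, here is a rough Python sketch that counts how many interworking branches a naive per-instruction scheme would insert. The `modes` list (one entry per source instruction) and the function name are made up for illustration, and the model ignores things like the 4-byte alignment padding that ARM instructions would also need.

```python
def count_mode_switches(modes, start="thumb"):
    """Count the interworking branches needed to run `modes` in order.

    `modes` holds "thumb" or "arm" for each instruction; `start` is the
    state the CPU is assumed to be in on entry. One extra branch is
    needed every time the required state changes.
    """
    switches = 0
    current = start
    for mode in modes:
        if mode != current:
            switches += 1
            current = mode
    return switches


# One ARM-only instruction in the middle of Thumb code: two extra branches.
lone = count_mode_switches(["thumb", "arm", "thumb"])

# Two ARM instructions back to back share the same pair of branches.
paired = count_mode_switches(["thumb", "arm", "arm", "thumb"])

# Alternating modes is the worst case: a branch around every instruction.
alternating = count_mode_switches(["thumb", "arm", "thumb", "arm", "thumb"])
```

Each counted switch is at least one extra 16-bit or 32-bit instruction plus a taken branch, so a couple of stray ARM-only instructions would easily wipe out whatever code-size win Thumb gave you.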

So correctness (but not performance) is achievable for straight-line decoding (although maybe clobbering lr if there isn't a Thumb interworking branch that takes a relative address without also setting LR as a return address). If it requires clobbering a register like lr, that's not even really correct vs. the asm as written. You'd have to consider lr as being like MIPS $at (assembler temporary) that the assembler can use as a scratch while expanding your source-code pseudo-instructions into multiple machine instructions.

Conditional branches and jump tables can all work, possibly using it eq / blxeq <target> or something to emulate a beq if the target instruction is ARM and the branch is in a Thumb-mode block. Jump tables can store label addresses as addr+1 for Thumb-mode targets. But that would mean you couldn't use the tbb and tbh instructions at all unless every target was also in Thumb mode, because they don't do interworking, and emulating them correctly would need a scratch register.
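The addr+1 convention is just the interworking rule that bit 0 of a bx / blx <reg> target selects the instruction set. A small Python sketch of how a hypothetical mixed-mode jump table would encode and decode its entries (the function names and addresses are invented for illustration):

```python
def interworking_address(addr, thumb):
    """Return the value to store in a jump table for a code label.

    Thumb targets are stored with bit 0 set (addr+1); ARM targets are
    stored as-is. Instruction addresses themselves are always even.
    """
    if addr & 1:
        raise ValueError("instruction addresses must be at least 2-byte aligned")
    return addr | 1 if thumb else addr


def decode_bx_target(value):
    """Model what bx does with a register value: bit 0 picks the state."""
    if value & 1:
        return ("thumb", value & ~1)  # bit 0 is cleared before fetching
    return ("arm", value)


# A table mixing one Thumb target and one ARM target (made-up addresses).
table = [
    interworking_address(0x8000, thumb=True),
    interworking_address(0x8010, thumb=False),
]
decoded = [decode_bx_target(entry) for entry in table]
```

A dispatch through such a table has to load the entry into a register and bx to it, which is exactly the extra register that tbb / tbh avoid needing.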

So the only thing you'd have real trouble doing correctly would be computed jumps where some targets are in different modes (like add r0, pc, r1 / bx r0). The assembler wouldn't be able to generate code to fix up the address. So it's possible to write code that would defeat an attempt to use Thumb as much as possible.

Of course all of this is a non-starter for performance reasons, even if it was possible without clobbering lr, so working out the limits of achievable correctness has been a fun thought experiment in silly computer tricks. :P

Doesn't hurt to ask, but turns out there's a good reason why you didn't find anything.