Are additional opcodes in RISC-V instructions important?

RISC-V was designed so that all instructions would be the same length, hence the existence of different types of instruction formats (I-type, R-type, S-type, etc.)

R-format follows this pattern - 7 bits funct7, 5 bits rs2, 5 bits rs1, 3 bits funct3, 5 bits rd, and 7 bits opcode.

Whereas, I-format follows this pattern - 12 bits immediate, 5 bits rs1, 3 bits funct3, 5 bits rd, and 7 bits opcode.

My question is: considering that the operation type is pretty much determined by the opcode, what exactly is the purpose of the additional opcode bits in funct3, funct7, etc.?

My theory is they are included simply to make all instructions equal in size but I'm likely to be wrong.

CodePudding user response：

"RISC-V was designed so that all instructions would be the same length."

This is incorrect.

RISC-V allows for variable-length instructions, in increments of 2 bytes: 2 bytes, 4 bytes, 6 bytes, or more.

The lowest byte of the instruction, within the 7-bit opcode field, indicates the size of the instruction (at least for instructions up to 64 bits). This is a much more rational scheme than that used by some other instruction sets, where there can be multiple prefix bytes and the full instruction size depends on the rest of the instruction.

If the lowest 2 bits are 00, 01, or 10, it is a 16-bit instruction. If the lowest 2 bits are 11, it is a 32-bit instruction, unless the next 3 bits are 111. That pattern is used to encode the larger instruction sizes, and then the true size is determined by more (opcode or other) bits.
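
The length rule above can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and the 48-bit and 64-bit patterns follow the instruction-length encoding scheme described in the RISC-V manual as I understand it (those longer formats are not part of the frozen base ISA, so treat them as an assumption). Anything longer is simply reported as None.

```python
def instruction_length_bits(low_halfword):
    """Return the instruction length in bits, judged from its lowest 16 bits.

    Hypothetical helper for illustration; returns None for encodings
    longer than 64 bits, which this sketch does not handle.
    """
    if (low_halfword & 0b11) != 0b11:
        return 16  # lowest two bits are 00, 01 or 10: compressed instruction
    if (low_halfword & 0b11100) != 0b11100:
        return 32  # lowest two bits are 11, next three are not 111
    if (low_halfword & 0b111111) == 0b011111:
        return 48  # assumed pattern for 48-bit instructions
    if (low_halfword & 0b1111111) == 0b0111111:
        return 64  # assumed pattern for 64-bit instructions
    return None    # longer or reserved encodings


print(instruction_length_bits(0x0001))  # c.nop -> 16
print(instruction_length_bits(0x0013))  # low half of nop (addi x0, x0, 0) -> 32
```

Note that only the first halfword is needed to know how many more bytes to fetch, which is the point being made about the scheme.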

"My theory is they are included simply to make all instructions equal in size but I'm likely to be wrong."

Instruction sizes do need to round to a multiple of 2 bytes so, yes, 32 bits is a sweet spot, in some sense. However, rest assured that these funct fields are well considered for various purposes (like custom instructions), and not just padding to get to 32 bits!

CodePudding user response：

RISC-V is a simple ISA with lots of coding space left for extensions (new instructions). That was intentional.

Having fixed-width instructions is pretty central to the original RISC idea of being simple to pipeline and decode in parallel. In a classic RISC the fetch stage is incrementing the program counter by 4 to fetch the next instruction in parallel with the last one being decoded.

With a good format for 16-bit compressed instructions like RV32C or ARM Thumb, pipelining a mix of 16 and 32-bit instructions isn't a problem; you fetch in wide contiguous chunks and pull instructions out of a buffer. It's still easy and fast to find instruction boundaries. Modern transistor budgets, even for microcontrollers, are larger than when MIPS was first being designed, and beyond that, even an in-order superscalar CPU has variable instructions-per-cycle, so having a fetch buffer isn't a problem. See also "Modern Microprocessors: A 90-Minute Guide!" for some pipelining and superscalar (more than 1 instruction per cycle) basics.
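
To make the "pull instructions out of a buffer" idea concrete, here is a rough software sketch of finding instruction boundaries in a fetched chunk. It is not how hardware is built, just the same logic written sequentially. The function name is hypothetical, and it assumes little-endian code containing only 16-bit and 32-bit instructions.

```python
def split_instructions(code):
    """Yield (offset, size_in_bytes, value) for each instruction in `code`.

    Sketch only: assumes little-endian bytes and only 16/32-bit encodings.
    """
    offset = 0
    while offset + 2 <= len(code):
        low = int.from_bytes(code[offset:offset + 2], "little")
        # The two lowest bits say whether this is a 16-bit or 32-bit instruction.
        size = 2 if (low & 0b11) != 0b11 else 4
        if offset + size > len(code):
            break  # instruction continues in the next fetch chunk
        value = int.from_bytes(code[offset:offset + size], "little")
        yield offset, size, value
        offset += size


# c.nop (0x0001), nop (0x00000013), c.nop (0x0001)
chunk = bytes.fromhex("0100130000000100")
for offset, size, value in split_instructions(chunk):
    print(offset, size, hex(value))
# 0 2 0x1
# 2 4 0x13
# 6 2 0x1
```

Each boundary depends only on two bits of the previous instruction's first halfword, which is why this stays cheap compared to x86-style length decoding.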

Having each instruction be a different number of bits, like a bitstream that wasn't aligned even to byte boundaries, would be very inconvenient, especially for branching. It's no surprise nothing does that. But even with byte alignment, x86's variable-length instructions are a huge pain to decode, and cost significant power. (And are so troublesome that modern CPUs cache the decode results.)

RISC-V follows a similar approach to MIPS, where all R-type instructions have the same "opcode" field, using the "funct" field to control the ALU. (I-type and J-type instructions, and RISC-V U-type, don't have room elsewhere, so the opcode field is actually the opcode in the normal sense, determining which instruction it is.)

RISC-V funct3 and funct7 aren't really separate, they're a 10-bit field that just happens not to be contiguous. The RISC-V manual has an explanation for why it spreads around immediates and other things into separate fields: This allows other fields to always be in the same place in all formats when they're present at all, saving some muxes in the decoders. Other fields get spaced out around them. RISC-V: Immediate Encoding Variants has a few answers, including one which quotes those docs.
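
As a small illustration of the fixed field positions, the sketch below pulls the fields out of a 32-bit instruction word using the bit positions of the R-type layout from the question, and glues funct7 and funct3 into one 10-bit value. The helper names are made up for this example.

```python
def fields(word):
    """Extract the R-type fields from a 32-bit instruction word."""
    return {
        "opcode": word & 0x7F,           # bits 6..0
        "rd":     (word >> 7) & 0x1F,    # bits 11..7
        "funct3": (word >> 12) & 0x7,    # bits 14..12
        "rs1":    (word >> 15) & 0x1F,   # bits 19..15
        "rs2":    (word >> 20) & 0x1F,   # bits 24..20
        "funct7": (word >> 25) & 0x7F,   # bits 31..25
    }


def funct10(word):
    """Treat funct7 and funct3 as one non-contiguous 10-bit field."""
    f = fields(word)
    return (f["funct7"] << 3) | f["funct3"]


ADD_X3_X1_X2 = 0x002081B3   # add  x3, x1, x2
SUB_X3_X1_X2 = 0x402081B3   # sub  x3, x1, x2
ADDI_X3_X1_5 = 0x00508193   # addi x3, x1, 5

print(hex(funct10(ADD_X3_X1_X2)))  # 0x0
print(hex(funct10(SUB_X3_X1_X2)))  # 0x100
# rd and rs1 sit in the same bits in the I-type instruction as well:
print(fields(ADDI_X3_X1_5)["rd"], fields(ADDI_X3_X1_5)["rs1"])  # 3 1
```

For the I-type word, the "rs2" and "funct7" entries are just slices of the 12-bit immediate, so only the fields that the format actually has are meaningful.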

RISC-V I-type instructions only have a funct3 field, no funct7, so in that case you only have a 3-bit field. It probably makes sense to use the opcode and funct3 bits together to specify which operation the ALU should do, so addi vs. add can decode as similarly as possible. (I haven't looked at exactly how RISC-V lays out its opcodes).
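
For what it's worth, the base integer encodings do seem to work this way: the register-register opcode (0x33) and the register-immediate opcode (0x13) use the same funct3 value for the same ALU operation, and funct7 only distinguishes sub from add in the register-register form. Below is a minimal sketch covering just add/sub/xor/or/and; the table and function names are invented for illustration, and shifts, comparisons and other extensions are deliberately left out.

```python
OP = 0x33      # register-register ALU instructions (add, sub, xor, ...)
OP_IMM = 0x13  # register-immediate ALU instructions (addi, xori, ...)

# The same funct3 value selects the same ALU operation for both opcodes.
ALU_BY_FUNCT3 = {0b000: "add", 0b100: "xor", 0b110: "or", 0b111: "and"}


def alu_operation(word):
    """Return the ALU operation name, or None if this sketch doesn't cover it."""
    opcode = word & 0x7F
    funct3 = (word >> 12) & 0x7
    funct7 = (word >> 25) & 0x7F
    if opcode not in (OP, OP_IMM) or funct3 not in ALU_BY_FUNCT3:
        return None
    if opcode == OP:
        if funct3 == 0b000 and funct7 == 0x20:
            return "sub"   # funct7 tells sub apart from add
        if funct7 != 0:
            return None    # other funct7 values are not handled here
    # For OP_IMM the upper bits belong to the immediate, so they are ignored.
    return ALU_BY_FUNCT3[funct3]


print(alu_operation(0x002081B3))  # add  x3, x1, x2 -> add
print(alu_operation(0x402081B3))  # sub  x3, x1, x2 -> sub
print(alu_operation(0x00508193))  # addi x3, x1, 5  -> add
```

So the funct3 bits can drive the ALU control the same way for add and addi, with the opcode only choosing whether the second operand comes from a register or from the immediate.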