So I am a beginner in MIPS assembly and I know how to use the basic instructions such as lw, sw, addi, etc., but I can't really understand how these together can form a single function. Let's say I want to write a simple swap function in MIPS: in what order should the instructions go so that the code works as intended?
Is there a general "rule" about how a function should be written? What do I do first and then what do I do after that? Also how does the computer memory interact with the registers?
Since I mentioned swap, below is an example I found. I don't understand why each line is "useful" to the function, especially when it comes to (I think) storing the vector(?).
If at least someone could explain the lines 5-10 (starting from sll) that would really help.
Sorry for asking so many questions at once, I am just really confused.
swap: #swap method
addi $sp, $sp, -12 # Make stack room for three
sw $a0, 0($sp) # Store a0
sw $a1, 4($sp) # Store a1
sw $a2, 8($sp) # store a2
sll $t1, $a1, 2 # t1 = 4a
add $t1, $a0, $t1 # t1 = arr + 4a
lw $s3, 0($t1) # s3 = t = arr[a]
sll $t2, $a2, 2 # t2 = 4b
add $t2, $a0, $t2 # t2 = arr + 4b
lw $s4, 0($t2) # s4 = arr[b]
sw $s4, 0($t1) #arr[a] = arr[b]
sw $s3, 0($t2) #arr[b] = t
addi $sp, $sp, 12 #Restoring the stack size
jr $ra #jump back to the caller
CodePudding user response:
Is there a general "rule" about how a function should be written? What do I do first and then what do I do after that? Also how does the computer memory interact with the registers?
Yes, mostly — to write a function in assembly, we write:
- entry point — the function label
- function prologue — stack allocation & register preservation (when needed)
- function body — your function's algorithm
- function epilogue — register restoration & stack deallocation (when needed)
- return to the caller (on MIPS, jr $ra)
The function must follow the rules of the Calling Convention, which is part of the Application Binary Interface, ABI.
Those rules tell us:
1. how parameters are passed by the caller to the callee, and thus where they can be expected to be found at the entry point (i.e. before the first instruction of the function executes)
2. how return values are passed by the callee back to the caller, and thus where the caller can expect to find them
3. how the callee knows which caller to return to
4. which registers must be restored to their original values before returning to the caller, if changed
5. which registers a caller can rely on being preserved across a call (this is the same register set as in 4)
6. which registers can be repurposed in a callee without needing to save and restore them
7. which registers a caller cannot rely on remaining unmodified after a call (this is the same set as in 6)
For MIPS:
Parameters are passed in argument registers, $a0, $a1, $a2, $a3.
Return values are passed in value registers, $v0, $v1.
The return address (or, by the older term, linkage) is a pointer parameter that tells the callee where to return in order to go back to the right caller. This is a pointer to code that generally refers to the instruction in the caller immediately after the call. The callee expects to find this value in the $ra register. You never see the return address in C, but it is revealed in assembly.
The stack pointer offers stack storage, and the rules of the stack on MIPS are:
- allocate storage before using it,
- deallocate exactly as much as allocated before returning to caller
- allocation is done by subtracting from the stack pointer
- the memory from the location referred to by the stack pointer after allocation, and up to where it was before, is your new memory.
- it should go without saying, but don't write to stack memory you didn't allocate.
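As a rough illustration of the stack rules above, here is a small C model. It is only a sketch under stated assumptions: the array stack_mem, the index sp, and all function names are hypothetical stand-ins for real stack memory and $sp, and real MIPS code would use addi, sw and lw instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

/* Hypothetical model: this array stands in for stack memory. */
static uint8_t stack_mem[STACK_BYTES];

/* Hypothetical stand-in for $sp. It starts at the high end because
   allocation moves it toward lower addresses. */
static size_t sp = STACK_BYTES;

/* Store one 32-bit word at a byte offset from sp,
   loosely modelling sw with an offset($sp) operand. */
static void store_word(size_t offset, int32_t value) {
    memcpy(&stack_mem[sp + offset], &value, sizeof value);
}

/* Load one 32-bit word from a byte offset from sp,
   loosely modelling lw with an offset($sp) operand. */
static int32_t load_word(size_t offset) {
    int32_t value;
    memcpy(&value, &stack_mem[sp + offset], sizeof value);
    return value;
}

int32_t sum_via_stack(int32_t a, int32_t b, int32_t c) {
    sp -= 12;              /* allocate 12 bytes before using them */
    store_word(0, a);      /* bytes sp+0 .. sp+11 are now ours    */
    store_word(4, b);
    store_word(8, c);

    int32_t total = load_word(0) + load_word(4) + load_word(8);

    sp += 12;              /* deallocate exactly what was allocated */
    return total;
}

/* Lets a caller check that sp came back to where it started. */
size_t current_sp(void) {
    return sp;
}
```

After a call such as sum_via_stack(1, 2, 3), current_sp() reports the same value as before the call, which is the property a caller depends on.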
As far as the function body is concerned, if you have an algorithm for the function, then the body should follow that algorithm. If the algorithm is written in a high level language, then you should know that every structured (control) statement, like if-then, if-then-else, while, do-while, for, is a pattern (of expressions and nested statements) that has an equivalent pattern in assembly language. The same goes for expressions: they can be decomposed into their component parts and executed in machine code.
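To make the idea of an equivalent pattern concrete, here is a sketch in C that writes the same loop twice: once as a structured while, and once using only labels, a conditional jump and an unconditional jump, which is roughly the shape the loop takes in assembly. The function and label names are made up for this example.

```c
/* Structured version: sum of 1..n using a while loop. */
int sum_to_n(int n) {
    int sum = 0;
    int i = 1;
    while (i <= n) {
        sum += i;
        i++;
    }
    return sum;
}

/* Same algorithm in "assembly style": the loop condition is inverted
   and becomes a conditional jump out of the loop, and the end of the
   body jumps back to the top. */
int sum_to_n_goto(int n) {
    int sum = 0;
    int i = 1;
loop_top:
    if (i > n) goto loop_done;   /* exit when (i <= n) is false */
    sum += i;
    i++;
    goto loop_top;               /* go around again */
loop_done:
    return sum;
}
```

Each goto line corresponds to one branch or jump instruction, so translating the second form to MIPS is close to a line-by-line exercise.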
Also how does the computer memory interact with the registers?
At the machine code level, the processor offers physical storage. This includes the CPU registers and main memory. The registers are fast and sit directly inside the CPU, so machine code instructions operate on them directly. Main memory is outside the CPU but vast. Main memory is addressable but CPU registers are not. Any data structure that requires referencing or indexing must be stored in main memory, which is fine because there are really not enough registers to store more than a very small array. Data structures with these requirements include strings, trees, linked lists, and arrays, so they are allocated in main memory.
While we're at it, let's also note that code is stored in main memory, which means that every instruction of a machine code program has a unique memory address. Using this property, we can create pointers that refer to specific instructions, and that is what the return address relies on.
Machine code programs (whether written by assembly programmers or translated by compilers) transfer data from main memory to registers and back; on MIPS, this is done with load and store instructions. They use the registers to accomplish calculations, and main memory to store data structures.
The swap function in the question allocates stack space and stores its parameters there but makes no further use of that memory, so this is wasteful but harmless. The function does deallocate the allocated space before returning, which is appropriate given that it allocated it in the first place (though the allocation wasn't really needed here).
The lines starting with sll do array indexing. That code uses the $s registers in violation of the calling convention: these registers may be used, but only if their incoming values are preserved upon return, which here they are not.
Indexing a word (an int) array requires manipulating byte offsets. If the first element of a word array is at address A, for example, then the second word is at address A + 4, because the first element takes 4 bytes, each of which also has an address. So, the formula for indexing a word array is A + i*4, which is what that code computes. It indexes twice (computing the addresses of A[i] and A[j]), loads from those addresses, and then stores each value back to where the other came from, which is a swap operation.
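The byte-offset arithmetic can be checked from C. The sketch below computes the address of element i by hand, the way the sll/add pair does, and the result can be compared with what the compiler produces for &A[i]. The function name element_address is hypothetical, and the sketch assumes 4-byte words (int32_t), a non-negative index, and a platform where pointers convert to and from uintptr_t in the usual flat way.

```c
#include <stdint.h>

/* Compute the address of A[i] manually: base address plus i * 4.
   The shift by 2 mirrors "sll $t1, $a1, 2", and the addition
   mirrors "add $t1, $a0, $t1". Assumes i >= 0. */
int32_t *element_address(int32_t *A, int i) {
    uintptr_t base = (uintptr_t)A;           /* address of A[0]  */
    uintptr_t offset = (uintptr_t)i << 2;    /* i * 4 bytes      */
    return (int32_t *)(base + offset);       /* address of A[i]  */
}
```

For an array such as int32_t A[4] = {10, 20, 30, 40}, element_address(A, 2) and &A[2] refer to the same location, and dereferencing either one corresponds to the lw that follows in the assembly.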
Here is the function in C:
void swap ( int *A, int i, int j ) {
    int *ai = &A[i];   // compute a pointer to A[i]
    int tempi = *ai;   // copy A[i]'s value from memory into a local variable
    int *aj = &A[j];   // compute a pointer to A[j]
    int tempj = *aj;   // copy A[j]'s value from memory into a local variable
    *ai = tempj;       // store tempj (A[j]'s original value) into A[i]
    *aj = tempi;       // store tempi (A[i]'s original value) into A[j]
}
I have shown it with common subexpression elimination applied: the addresses of A[i] and A[j] are each computed once and then used both to read and to write the array elements, as has been done in the assembly.
A is in $a0, i in $a1, and j in $a2 upon function entry, as per the calling convention, which callers will follow.