I'm currently building a (very simple) OS, using NASM. I'm trying to write an assembly function that will read a byte from an address given as a function parameter, then return the data, like so:
// kernel.c
extern int readbyte();
int main(void) {
int x = readbyte(0xdeadbeef); //returns data stored at 0xdeadbeef
return 0;
}
; kernel_entry.asm
[bits 32]
[extern main]
[extern readbyte]
call main
readbyte:
; somehow read function parameter here and store it to ecx
mov eax, [ecx] ; read byte from given address
ret ; return the data to the C function
Returning data via the eax register is working fine, but I can't figure out where the parameter data given by the C program (0xdeadbeef) is or how I can access it. When I try to pop from the stack, the whole program just crashes. I've tried reading from various registers, but those don't ever match up with the parameter given in the C code.
Could somebody point me in the right direction?
Edit:
Following Michael Petch's suggestion, I wrote an assembly function that returns the first parameter:
readbyte:
mov ecx, [esp + 4]
mov eax, ecx
ret
The issue is that when I test it in Qemu, the display rapidly flashes between the boot sequence screen and displaying the correct data. Does anybody know why it's crashing, yet also showing the proper data?
Edit 2:
So I discovered that the above code will run without entering a reboot loop if I pass a hexadecimal value less than 4 digits long. So 0xABC will work properly, but 0xABCD will crash the system.
Answer:
In modern operating systems, this depends on the ABI (the set of rules that specifies how separately compiled modules interface with one another). Many ABIs reserve some CPU registers for passing parameters (for example, four of them) and push the remaining parameters on the stack. Suppose the ABI passes the first four arguments in registers and the function is called with six arguments. The caller first pushes the last two arguments on the stack, in reverse order relative to the source code (the sixth argument first, then the fifth), then loads the first four arguments into the registers, and finally calls the routine, which pushes the return address on the stack as well. Inside the routine there is normally a function prologue (or preamble) that saves the caller's frame pointer by pushing it on the stack and then copies the stack pointer into that register, like this:
push ebp
mov ebp, esp
This lets you change the stack pointer freely (pushing local storage or doing calculations) and still reach the function parameters at fixed offsets from EBP, without having to track where SP currently points.
Local storage is also reserved at this point, by subtracting a suitable value from the stack pointer. This leads to the following stack layout:
| more stack... |
~~~~~~~~~~~~~~~~
| sixth param |
----------------
| fifth param |
---------------- (previous parameters are in cpu registers)
| return address|
----------------
| old value BP | <--- BP points here now.
----------------
| local vars |
. .
| | <--- SP points here at function entry, after preamble
----------------
| local stack |
So, you should use (in this case, for this function):
address | to access... |
---|---|
[EBP + 12] | sixth parameter |
[EBP + 8] | fifth parameter |
[EBP + 4] | return address |
[EBP + 0] | old linkage BP |
[EBP - 4] | first local var |
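The offsets in this table can be checked with a small simulation. The following is a minimal C sketch, not real calling-convention code: ToyStack, toy_init, toy_push, toy_load, and toy_enter are hypothetical names, the stack is modeled as an array of 32-bit words, and it assumes the same illustrative ABI as above (four arguments in registers, the fifth and sixth on the stack, one 4-byte local).

```c
#include <stdint.h>

#define TOY_STACK_WORDS 64

/* Toy model of a downward-growing stack made of 32-bit slots. */
typedef struct {
    uint32_t mem[TOY_STACK_WORDS]; /* mem[i] models the word at byte offset 4*i */
    uint32_t esp;                  /* byte offset of the current top of stack */
    uint32_t ebp;                  /* byte offset held in the frame pointer */
} ToyStack;

/* Start with an empty stack: ESP just past the highest slot. */
void toy_init(ToyStack *s)
{
    for (int i = 0; i < TOY_STACK_WORDS; i++) {
        s->mem[i] = 0;
    }
    s->esp = TOY_STACK_WORDS * 4;
    s->ebp = 0;
}

/* Models "push v": decrement ESP by 4, then store. */
void toy_push(ToyStack *s, uint32_t v)
{
    s->esp -= 4;
    s->mem[s->esp / 4] = v;
}

/* Models a 32-bit load from a byte offset such as ebp + 8. */
uint32_t toy_load(const ToyStack *s, uint32_t addr)
{
    return s->mem[addr / 4];
}

/* Models the caller's pushes, the call, and the callee's prologue. */
void toy_enter(ToyStack *s, uint32_t fifth, uint32_t sixth,
               uint32_t ret_addr, uint32_t caller_ebp)
{
    s->ebp = caller_ebp;
    toy_push(s, sixth);    /* caller pushes the last argument first */
    toy_push(s, fifth);
    toy_push(s, ret_addr); /* "call" pushes the return address */
    toy_push(s, s->ebp);   /* push ebp */
    s->ebp = s->esp;       /* mov ebp, esp */
    s->esp -= 4;           /* sub esp, 4: room for one local variable */
}
```

After toy_enter(&s, 55, 66, 0x1000, 0x7777), loading from s.ebp + 8 yields the fifth argument and s.ebp + 12 the sixth, matching the table.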
But, as I said, you should conform to the ABI your compiler uses. It should be described in your compiler's documentation (or the documentation should at least reference the document that describes it).
Consider also that the offsets depend on each variable's size and its alignment. This is important.
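To illustrate how size affects the offsets, here is a minimal C sketch. It assumes that every stack argument occupies a whole number of fixed-size slots (4 bytes on 32-bit x86 cdecl) and that the first stack argument sits at a known base offset from EBP; other ABIs use different rules. The names round_up and stack_arg_offsets are hypothetical.

```c
#include <stddef.h>

/* Round size up to the next multiple of slot (slot must be non-zero). */
size_t round_up(size_t size, size_t slot)
{
    return (size + slot - 1) / slot * slot;
}

/*
 * Fill offsets[i] with the EBP-relative offset of stack argument i.
 * sizes[i] is the argument's size in bytes, slot is the stack slot size,
 * and base is the offset of the first stack argument.
 */
void stack_arg_offsets(const size_t *sizes, size_t count,
                       size_t slot, size_t base, size_t *offsets)
{
    size_t off = base;
    for (size_t i = 0; i < count; i++) {
        offsets[i] = off;
        off += round_up(sizes[i], slot);
    }
}
```

For example, with 4-byte slots and a base of 8, arguments of sizes 1, 8, and 4 bytes land at offsets 8, 12, and 20: the 1-byte argument still consumes a full slot, and the 8-byte argument consumes two.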
At the end, by the way, there is also some epilogue code to undo the changes made by the prologue. If ESP already holds the same value as EBP (no local storage is still allocated and every push has been matched by a pop), no mov esp, ebp is needed, but the saved frame pointer must still be restored with:
pop ebp
This leaves the return address on top of the stack, so the function can return. From this point on, the caller is responsible for adjusting the stack (removing the arguments it pushed for the call).
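Applied to the question: assuming a 32-bit cdecl-style convention (an assumption, since the question does not name the compiler or its flags), every argument is passed on the stack, so on entry to readbyte the argument sits at [esp + 4], or at [ebp + 8] once the prologue above has run. Note also that mov eax, [ecx] loads four bytes rather than one. As a hedged sketch, the C function below (readbyte_c is a hypothetical name, not part of the original code) shows the intended behavior, reading exactly one byte and zero-extending it, which an assembly version would need to reproduce.

```c
#include <stdint.h>

/*
 * C equivalent of the intended readbyte: load a single byte from addr
 * and return it zero-extended to int. volatile keeps the compiler from
 * optimizing the access away, which matters for memory-mapped hardware.
 */
int readbyte_c(const volatile void *addr)
{
    return *(const volatile uint8_t *)addr;
}
```

Declaring the assembly routine with a matching prototype, for example int readbyte(const void *addr), instead of an empty parameter list also lets the compiler check the argument at each call site.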