I have been struggling for the last couple of days with x86-64 assembly (using NASM on macOS). I’d like to show two pieces of code.
So let’s say that I have an array and I want to print it. This is the code that I have:
_main:
push rbp
mov rbp, rsp
lea rdi, array
mov rsi, length
call _printy
pop rbp
mov rdi, 0
call _exit
_printy:
push rbp
mov rbp, rsp
sub rsp, 16
xor rbx, rbx
mov r12, rdi
mov r15, rsi
.loop:
cmp rbx, r15
jge .done
lea rdi, format
mov rsi, [r12 + rbx * 4]
xor rax, rax
call _printf
add rbx, 1
jmp .loop
.done:
leave
ret
Brief explanation:
- pass the array in rdi and the length in rsi
- call _printy
- create a 16-byte-aligned stack frame for the printf call and keep the data in the preserved registers
- loop until the index reaches the length, then leave
- then exit
And this is the painful part. Consider the following code for the exit portion:
call printy(char const*)
mov eax, 0
pop rbp
ret
This is the code generated by Compiler Explorer. If I try to replicate it, printf fails miserably. Stepping through with the debugger, I get:
thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xe)
frame #0: 0x0000000100015501 dyld`start + 465
dyld`start:
-> 0x100015501 <+465>: mov rax, qword ptr [rbx + 0x8]
0x100015505 <+469>: mov edi, dword ptr [rax + 0x34]
0x100015508 < 472>: xor esi, esi
0x10001550a < 474>: call 0x100041a26 ; dyld3::MachOFile::isSimulatorPlatform(dyld3::Platform, dyld3::Platform*)
I cannot understand the reason for this crash, nor why calling the _exit function works instead, so this is my first question.
My second question is about register sizes. Is this code better than the first one?
_main:
push rbp
mov rbp, rsp
lea rdi, array
mov esi, length
call _printy
pop rbp
mov rdi, 0
call _exit
_printy:
push rbp
mov rbp, rsp
sub rsp, 16
xor ebx, ebx
mov r12, rdi
mov r15d, esi
.loop:
cmp ebx, r15d
jge .done
lea rdi, format
mov esi, [r12 + rbx * 4]
xor rax, rax
call _printf
add ebx, 1
jmp .loop
.done:
leave
ret
As you can see, I am loading into esi instead of rsi, and using ebx and r15d.
The line I am interested in is
mov esi, [r12 + rbx * 4]
Is it better to use esi (I am working with an array of integers), or does using rsi not make much difference? By using esi I am storing into the lower 4 bytes of the rsi register, while using rsi itself I am consuming all 8 bytes. Is there some kind of performance hit? At the end of the day I guess the number will be zero-extended and, whether you store 4 or 8 bytes, the rsi register cannot hold another integer. And what about this similar operation: movsx rsi, dword [r12 + rbx * 4]? Now I am explicit (I am saying that I want to load 4 bytes, and I am sign-extending). Is this better than using the 4-byte register directly? Or is it pretty similar to mov rsi, [r12 + rbx * 4]?
If you can clear up just some of my doubts I would be really happy. Thanks for your attention.
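For reference, the three loads being compared do not read the same bytes. The following is only a sketch, assuming NASM with macho64 output on macOS; the labels array and fmt and the array contents are made up for illustration. Note that the 32-to-64-bit sign-extending form is spelled movsxd.

```nasm
; Sketch only: array, fmt and the values are illustrative assumptions.
default rel                     ; RIP-relative addressing for [label]
extern _printf
global _main

section .data
array:  dd -1, 7                ; two adjacent 32-bit integers
fmt:    db "%ld", 10, 0         ; prints a 64-bit signed value

section .text
_main:
    push rbp                    ; also realigns rsp to 16 bytes
    mov  rbp, rsp

    mov  esi, [array]           ; 4-byte load; writing esi zeroes bits 63..32 of rsi
    lea  rdi, [fmt]
    xor  eax, eax               ; no vector registers used for varargs
    call _printf                ; prints 4294967295

    movsxd rsi, dword [array]   ; 4-byte load, sign-extended into all of rsi
    lea  rdi, [fmt]
    xor  eax, eax
    call _printf                ; prints -1

    mov  rsi, [array]           ; 8-byte load: array[0] in the low half,
    lea  rdi, [fmt]             ; array[1] in the high half
    xor  eax, eax
    call _printf                ; prints 34359738367 (0x00000007FFFFFFFF)

    xor  eax, eax               ; return 0
    pop  rbp
    ret
```

With a %d format, printf only looks at the low 32 bits, so all three forms print the same element; the 8-byte form, however, also reads the next element, and on the last element it reads 4 bytes past the end of the array.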
CodePudding user response:
I'll answer the question in the title: the difference between a simple ret and the _exit function.
ret is used to return to the caller, while _exit is a request that the operating system terminate the program immediately, so it will execute no further instructions.
When can you use ret vs. calling exit()? You can call exit() at any point, no matter how deep the call chain is, and it will terminate the program without giving normal function callers a chance to execute or clean up at all. This would be a somewhat forceful termination if there are more things on the call stack than just main, but it is not necessarily a logic error if that's what the program should do, and sometimes this is used as a simple approach to error handling (e.g. print an error message and halt the program).
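The error-handling use mentioned above could look roughly like the following. This is a minimal sketch, assuming NASM with macho64 output and the C library's puts and exit; the names _fail and _helper and the message text are hypothetical.

```nasm
; Sketch only: _fail, _helper and msg are hypothetical names.
default rel
extern _puts
extern _exit
global _main

section .data
msg:    db "fatal: giving up", 0

section .text
_fail:
    push rbp                ; keeps rsp 16-byte aligned for the calls below
    mov  rbp, rsp
    lea  rdi, [msg]
    call _puts              ; print the error message
    mov  edi, 1             ; exit status 1
    call _exit              ; does not return

_helper:
    push rbp
    mov  rbp, rsp
    call _fail              ; two calls deep when the program terminates
    pop  rbp                ; never reached
    ret

_main:
    push rbp
    mov  rbp, rsp
    call _helper
    xor  eax, eax           ; never reached
    pop  rbp
    ret
```

Neither _helper nor _main gets to run its epilogue: the program ends inside _fail, regardless of what is still on the call stack.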
ret can be used to return to a caller, assuming you've cleaned up the stack to the same state as it was upon function entry; in other words, that the return address is the top thing on the stack. (You also have to restore the call-preserved registers, or else the caller may not work.)
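The point about call-preserved registers is a plausible explanation for the crash in the question: _printy overwrites rbx, r12 and r15 without saving them, and the faulting instruction in dyld`start dereferences rbx. The following is a minimal sketch of a version that saves and restores them so that _main can end with ret. It assumes NASM with macho64 output and the System V AMD64 calling convention; the contents of array, length and format are made up, since the question does not show them.

```nasm
; Sketch only: the data definitions below are assumptions.
default rel
extern _printf
global _main

section .data
array:  dd 1, 2, 3, 4, 5, 6
length  equ ($ - array) / 4     ; number of 32-bit elements
format: db "%d", 10, 0

section .text
_main:
    push rbp
    mov  rbp, rsp
    lea  rdi, [array]
    mov  esi, length
    call _printy
    xor  eax, eax               ; return value 0
    pop  rbp
    ret                         ; safe: stack and preserved registers are intact

_printy:
    push rbp
    mov  rbp, rsp
    push rbx                    ; call-preserved registers: save before use
    push r12
    push r15
    sub  rsp, 8                 ; three pushes plus 8 keeps rsp 16-byte aligned

    xor  ebx, ebx               ; index = 0 (also clears the upper half of rbx)
    mov  r12, rdi               ; array base
    mov  r15d, esi              ; element count
.loop:
    cmp  ebx, r15d
    jge  .done
    lea  rdi, [format]
    mov  esi, [r12 + rbx * 4]   ; load one 32-bit element
    xor  eax, eax               ; no vector registers used for varargs
    call _printf
    add  ebx, 1
    jmp  .loop
.done:
    add  rsp, 8
    pop  r15                    ; restore in reverse order
    pop  r12
    pop  rbx
    pop  rbp
    ret
```

Calling _exit instead of returning hides the problem, because the caller never runs again and so never uses the clobbered registers.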
main() is typically a function called by _start in crt0.o, where _start is the default program entry point.
However, _main (or any other externally visible symbol) can be set up directly as the program entry point, given the right linker options in the program's construction.
The program entry point is the first instruction to execute in a new program, and it is not invoked as a function by the operating system but rather it is merely transferred control, which means there is no return address (on the stack) and there are no parameters (at least, usually not following the regular calling convention). Because the program entry point is not invoked as a real function, it also cannot "return" to its caller (there is no caller of the program entry point). Therefore the program entry point (e.g. _start, if it gets control back by main() returning) must call exit() for proper program termination. (The same applies if an alternate program entry point is selected: it cannot return, and would have to use exit() or some other way to terminate the program.)