Difference between simple ret and _exit function in NASM x86-64

I have been in pain for the last couple of days with x86-64 assembly (using NASM on macOS). I'd like to show two pieces of code.

So let's say that I have an array and I want to print it. This is the code that I got:

_main:
   push rbp
   mov rbp, rsp
   
   lea rdi, array
   mov rsi, length
   call _printy
   
   pop rbp
   mov rdi, 0
   call _exit

_printy:
    push rbp    
    mov rbp, rsp
    sub rsp, 16
    
    xor rbx, rbx
    mov r12, rdi
    mov r15, rsi
    
    .loop:
        cmp rbx, r15
        jge .done   
        
        lea rdi, format 
        mov rsi, [r12 + rbx * 4]
        xor rax, rax
        call _printf

        add rbx, 1  
        jmp .loop

    .done:
        leave
        ret

brief explanation:

pass array in rdi, and length in rsi
call _printy
create a 16-byte-aligned stack frame for the printf call, save data in the preserved registers.
loop until condition holds and leave
then exit

And this is the painful part: consider the following code for the exit portion.

        call    printy(char const*)
        mov     eax, 0
        pop     rbp
        ret

This is the code generated by Compiler Explorer. If I try to replicate this, printf fails miserably. Stepping with the debugger I get:

thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xe)
    frame #0: 0x0000000100015501 dyld`start + 465
dyld`start:
->  0x100015501 <+465>: mov    rax, qword ptr [rbx + 0x8]
    0x100015505 <+469>: mov    edi, dword ptr [rax + 0x34]
    0x100015508 <+472>: xor    esi, esi
    0x10001550a <+474>: call   0x100041a26               ; dyld3::MachOFile::isSimulatorPlatform(dyld3::Platform, dyld3::Platform*)

I personally cannot understand the reason behind this crash, nor the success using the _exit function, so this is the first doubt someone could clear for me.

My second question is about register sizes. Is this code better than the first one?

_main:
    push rbp
    mov rbp, rsp
    
    lea rdi, array 
    mov esi, length
    call _printy    

    pop rbp
    mov rdi, 0
    call _exit

_printy:
    push rbp    
    mov rbp, rsp
    sub rsp, 16
    
    xor ebx, ebx
    mov r12, rdi
    mov r15d, esi
    
    .loop:
        cmp ebx, r15d
        jge .done   
        
        lea rdi, format 
        mov esi, [r12 + rbx * 4]
        xor rax, rax
        call _printf

        add ebx, 1  
        jmp .loop

    .done:
        leave
        ret

As you can see, I am loading into esi instead of rsi, and using ebx and r15d. The line I am interested in is mov esi, [r12 + rbx * 4]. Is it better to use esi (I am working with an array of integers), or does using rsi not make much difference? By using esi, I am storing into the lower 4 bytes of the rsi register, while using rsi itself I am consuming all 8 bytes. Is there some kind of performance hit? At the end of the day I guess the number will be zero-extended and, whether you save 4 or 8 bytes, the rsi register cannot hold another integer.

And what about this similar operation: movsx rsi, dword [r12 + rbx * 4]? Now I am explicit (I am saying that I want to store 4 bytes, and I am sign-extending). Is this better than using the 4-byte register directly? Or is it pretty similar to rsi, [r12 + rbx * 4]?

If you can clear up just some of my doubts I would be really happy. Thanks for your attention.

CodePudding user response：

I'll answer the question in the title:

difference between simple ret and _exit function

ret returns to the caller, while _exit is a request that the operating system terminate the program immediately, so the program executes no further instructions after the call.

When can you use ret vs. calling exit()? You can call exit() at any point, no matter how deep the call chain is, and it will terminate the program without giving the calling functions a chance to execute or clean up. This is a somewhat forceful termination if there is more on the call stack than just main, but it is not necessarily a logic error if that is what the program should do, and it is sometimes used as a simple approach to error handling (e.g. print an error message and halt the program).
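
As a rough illustration of terminating from deep in the call chain, here is a minimal NASM sketch for macOS x86-64. The labels _worker and _die and the message text are hypothetical names chosen for this example; _puts and _exit are the C library's puts and exit as they appear from assembly on macOS.

```nasm
; Sketch only. Assumes NASM, macOS x86-64 (Mach-O), System V AMD64 ABI,
; and linking against the C library.
        global  _main
        extern  _puts
        extern  _exit

        section .data
errmsg: db "fatal: giving up", 0        ; placeholder message

        section .text
_main:
        push    rbp                     ; also realigns rsp to 16 bytes
        mov     rbp, rsp
        call    _worker                 ; control never comes back here
        xor     eax, eax                ; not reached
        pop     rbp
        ret

_worker:                                ; hypothetical intermediate function
        push    rbp
        mov     rbp, rsp
        call    _die
        pop     rbp                     ; not reached: no cleanup happens
        ret

_die:                                   ; hypothetical "print and halt" helper
        push    rbp
        mov     rbp, rsp
        lea     rdi, [rel errmsg]
        call    _puts
        mov     edi, 1                  ; exit status
        call    _exit                   ; terminates the process, never returns
```

Neither _worker nor _main resumes after the call that leads to _die, which is exactly the "no chance to clean up" behaviour described above.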

ret can be used to return to a caller, assuming you've restored the stack to the state it was in upon function entry; in other words, the return address must be the top item on the stack. (You also have to restore the call-preserved registers, or else the caller may not work.)
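
The point about call-preserved registers applies directly to the code in the question: _printy overwrites rbx, r12 and r15 without saving them. The crash shown in the question faults on [rbx + 0x8] inside dyld`start, which would be consistent with a clobbered rbx, although that is an inference from the trace and not something confirmed. The sketch below assumes the System V AMD64 calling convention used on macOS and saves and restores those registers so that _main can finish with a plain ret. The array contents and format string are placeholders, not the asker's data.

```nasm
; Sketch only. Assumes NASM, macOS x86-64 (Mach-O), System V AMD64 ABI.
        global  _main
        extern  _printf

        section .data
array:  dd 1, 2, 3, 4                   ; placeholder 32-bit integers
length: equ ($ - array) / 4             ; element count
format: db "%d", 10, 0                  ; placeholder format string

        section .text
_main:
        push    rbp                     ; realigns rsp to 16 bytes
        mov     rbp, rsp
        lea     rdi, [rel array]
        mov     esi, length
        call    _printy
        xor     eax, eax                ; return value 0
        pop     rbp
        ret                             ; return to the caller of main

_printy:
        push    rbp
        mov     rbp, rsp
        push    rbx                     ; call-preserved registers used below
        push    r12
        push    r15
        sub     rsp, 8                  ; three pushes + 8 keeps rsp 16-byte aligned

        xor     ebx, ebx                ; index = 0 (also clears upper half of rbx)
        mov     r12, rdi                ; array base
        mov     r15d, esi               ; length
.loop:
        cmp     ebx, r15d
        jge     .done
        lea     rdi, [rel format]
        mov     esi, [r12 + rbx * 4]    ; load one 32-bit element
        xor     eax, eax                ; variadic call, no vector arguments
        call    _printf
        add     ebx, 1
        jmp     .loop
.done:
        add     rsp, 8
        pop     r15                     ; restore in reverse order
        pop     r12
        pop     rbx
        pop     rbp
        ret
```

With the registers restored, whatever called main sees the same rbx, r12 and r15 it had before the call, so returning is as safe as calling _exit.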

main() is typically a function called by _start in crt0.o, where _start is the default program entry point.

However, _main (or any other externally visible symbol) can be set up directly as the program entry point, given the right linker options when the program is built.

The program entry point is the first instruction to execute in a new program. The operating system does not invoke it as a function; it merely transfers control to it, which means there is no return address on the stack and there are no parameters (at least, usually not following the regular calling convention). Because the program entry point is not invoked as a real function, it also cannot "return" to its caller (there is no caller of the program entry point). Therefore the program entry point (e.g. _start, when it gets control back because main() returned) must call exit() for proper program termination. The same applies if an alternate program entry point is selected: it cannot return, and would have to use exit() or some other way to terminate the program.