I am trying to detect an EOF character, or just any character at all, but it doesn't work, and there is no error either.
section .data
file db "text.txt", 0
section .bss
char resb 1
section .text
global _start
_start:
mov rax, 2
mov rdi, file
syscall
mov rbx, rax
mov rdi, rbx
mov rax, 0
mov rsi, char
mov rdx, 1
syscall
mov rcx, char
cmp rcx, -1
je _endOfFile
call _end
_endOfFile:
print 1, file, 0
ret
_end:
mov rax, 3
mov rdi, rbx
syscall
mov rax, 60
mov rdi, 0
syscall
I expected it to print the name of the file, but it doesn't do anything. When I remove the cmp and make it jump unconditionally, it prints fine. I also tried comparing against other characters, and that didn't work either. I am really new to assembly, so I have no clue what to do.
CodePudding user response:
Okay, a few layers of problems here.
Most fundamental is that there is no "EOF character". Unlike ISO C's getc(), the Unix read system call doesn't signal end-of-file by reading back a particular character; it signals it by returning 0 as its return value. So you need to check the value in rax after the read syscall. If it is zero, then you have reached end-of-file. If it is 1, then you successfully read a character into the memory location char. If it is a smallish negative number, then an error occurred, and the negation of this value is an errno code.
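As a rough sketch of that check, here is a small program that opens the file, reads one byte, and branches on the return value in rax rather than on the buffer. The labels at_eof, failed and finish, and the exit statuses 0/1/2, are made up for illustration; the build command in the first comment assumes nasm and ld on x86-64 Linux.

```nasm
; Build (assumed toolchain): nasm -f elf64 eof.asm && ld -o eof eof.o
section .data
file db "text.txt", 0

section .bss
char resb 1

section .text
global _start
_start:
    mov rax, 2          ; open(file, O_RDONLY, 0)
    mov rdi, file
    mov rsi, 0          ; flags: O_RDONLY
    mov rdx, 0          ; mode, ignored without O_CREAT
    syscall
    cmp rax, 0
    jl  failed          ; negative return: open failed (-errno)
    mov rbx, rax        ; keep the file descriptor (rbx survives syscalls)

    mov rax, 0          ; read(fd, char, 1)
    mov rdi, rbx
    mov rsi, char
    mov rdx, 1
    syscall
    cmp rax, 0          ; inspect the return value, not the buffer
    je  at_eof          ; 0: end of file, nothing was stored in char
    jl  failed          ; negative: read failed (-errno)

    mov rdi, 0          ; rax = 1: one byte is now in [char]
    jmp finish
at_eof:
    mov rdi, 1          ; hypothetical status meaning "file was empty"
    jmp finish
failed:
    mov rdi, 2          ; hypothetical status meaning "open or read failed"
finish:
    mov rax, 60         ; exit(status in rdi); the kernel closes the fd
    syscall
```

Running it and then inspecting the shell's exit status (echo $?) shows which branch was taken.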
The comparison code also has a few bugs. First of all, mov rcx, char doesn't load the character from char; it loads the address of char, which naturally does not equal -1. If you look, this has exactly the same form as the mov rsi, char you used to set up the system call, which likewise put the address of char into rsi.
To specify the contents of memory at location char, you use square brackets: mov rcx, [char]. However, that wouldn't be right either. On x86-64, most instructions can operate on 8, 16, 32 or 64 bit operands. When at least one operand is a register, the size of the specified register dictates the operand size. So mov rcx, [char] would load 8 bytes, of which the lowest would be the byte from char, and the other 7 would be whatever garbage happened to follow it in memory.
To load one byte, use an 8-bit register, like cl. Then you need to likewise do the compare with only the 8-bit register, or else you're comparing against stuff that is not your character.
mov cl, [char]
cmp cl, -1
je got_ff
Though actually, in most cases, instead of mov cl, [char] it would be better to do movzx ecx, byte [char], which zeros out the upper bits of rcx. mov cl, [char] is defined as preserving those bits, which comes with a slight performance cost.
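One thing to watch with movzx: the byte is zero-extended, so a byte of 0xFF ends up as 255 in ecx, and a 32-bit compare against -1 (0xFFFFFFFF) would no longer match. A minimal sketch, using a hard-coded sample byte in place of data read from a file (the labels no_match and done and the exit statuses are illustrative):

```nasm
section .data
char db 0xFF                ; sample byte standing in for one read from a file

section .text
global _start
_start:
    movzx ecx, byte [char]  ; ecx = 0x000000FF; writing ecx also zeros the top half of rcx
    cmp ecx, 0xFF           ; compare against 255; "cmp ecx, -1" would test 0xFFFFFFFF
    jne no_match
    mov rdi, 0              ; exit status 0: the byte was 0xFF
    jmp done
no_match:
    mov rdi, 1              ; exit status 1: the byte was something else
done:
    mov rax, 60             ; exit(status in rdi)
    syscall
```

Comparing only the low byte with cmp cl, -1 would also work after the movzx, since cl still holds 0xFF.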
But actually actually, you don't need to load the character into a register at all; cmp works fine with a memory operand.
cmp byte [char], -1
je it_was_ff
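Putting these points together, a read-until-EOF loop could look roughly like the following. It is only a sketch: it echoes each byte to stdout as a stand-in for whatever per-character test you actually want, and the labels read_loop, close_file and exit_error are names chosen for this example.

```nasm
section .data
file db "text.txt", 0

section .bss
char resb 1

section .text
global _start
_start:
    mov rax, 2          ; open(file, O_RDONLY, 0)
    mov rdi, file
    mov rsi, 0          ; flags: O_RDONLY
    mov rdx, 0          ; mode, ignored without O_CREAT
    syscall
    cmp rax, 0
    jl  exit_error      ; negative return: open failed (-errno)
    mov rbx, rax        ; file descriptor, preserved across syscalls

read_loop:
    mov rax, 0          ; read(fd, char, 1)
    mov rdi, rbx
    mov rsi, char
    mov rdx, 1
    syscall
    cmp rax, 0
    je  close_file      ; 0: end of file reached
    jl  exit_error      ; negative: read failed (-errno)

    mov rax, 1          ; write(1, char, 1): echo the byte to stdout
    mov rdi, 1
    mov rsi, char
    mov rdx, 1
    syscall
    jmp read_loop

close_file:
    mov rax, 3          ; close(fd)
    mov rdi, rbx
    syscall
    mov rax, 60         ; exit(0)
    mov rdi, 0
    syscall

exit_error:
    mov rax, 60         ; exit(1); the kernel closes any open fd
    mov rdi, 1
    syscall
```

Reading one byte per syscall is slow for large files; a bigger buffer with the same return-value check would be the usual next step.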
CodePudding user response:
To open and read a file in x86-64 assembly on Linux, you can use the syscall instruction to invoke the open and read system calls provided by the kernel. Here is an example:
section .data
file db "text.txt", 0
section .bss
char resb 1
section .text
global _start
_start:
; Open the file (flags in rsi: 0 = O_RDONLY)
mov rax, 2
mov rdi, file
mov rsi, 0
syscall
mov rbx, rax
; Read from the file
mov rdi, rbx
mov rax, 0
mov rsi, char
mov rdx, 1
syscall
; Check for end of file: read returns 0 in rax at EOF,
; a negative errno value on error, otherwise the number of bytes read
cmp rax, 0
jle _endOfFile
; A byte was read into char. Close the file
mov rax, 3
mov rdi, rbx
syscall
; Exit
mov rax, 60
mov rdi, 0
syscall
_endOfFile:
; End of file (or read error). Close the file
mov rax, 3
mov rdi, rbx
syscall
; Exit
mov rax, 60
mov rdi, 0
syscall
In this code, file is a null-terminated string containing the path of the file to open, and char is a one-byte buffer where the data read will be stored.
The _start label is the entry point of the program. There, the open system call is invoked with the syscall instruction to open the file named by file. The returned file descriptor is saved in the rbx register.
Next, the read system call reads one byte from the file into the char buffer. Its return value in rax is then checked: if it is 0 (end of file) or negative (error), the program jumps to _endOfFile, which closes the file and exits. There is no end-of-file marker stored in the buffer itself.
Otherwise a byte was read successfully, and execution falls through to the code that closes the file and exits.
I hope this helps! Let me know if you have any other questions.
-chatgpt