I have a hello world program:
    .global _start
    .text
_start:
    # write(1, message, 13)
    mov $1, %rax            # system call 1 is write
    mov $1, %rdi            # file descriptor 1 is stdout
    mov $message, %rsi      # address of string to output
    mov $13, %rdx           # number of bytes
    syscall
    # exit(0)
    mov $60, %rax           # system call 60 is exit
    xor %rdi, %rdi          # we want return code 0
    syscall
message:
    .ascii "Hello, world\n"
I can assemble this into an object file with:
as hello.s -o hello.o
This object file is not executable. When I try to execute it, I get:
bash: ./hello.o: cannot execute binary file: Exec format error
I need to invoke the linker to make this viable:
ld hello.o -o hello
At this point, the hello program works. However, the use of the linker here is confusing to me: I'm not linking in any external libraries! I seem to just be linking the object file to nothing.
What is the linker doing for such a "self-contained" program?
Answer:
ELF files have different types, like ET_EXEC (traditional non-PIE executable) or ET_REL (relocatable object file, normally with a .o filename).
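If you want to check the type field yourself, here is a minimal sketch. The function name elf_type is made up for illustration, and it assumes a little-endian ELF file (as on x86-64); it just reads the 2-byte e_type field that follows the 16-byte e_ident array in the ELF header.

```shell
# Hypothetical helper: print the ELF type of a file.
# Assumes a little-endian ELF file, which is what x86-64 Linux uses.
elf_type() {
    # e_type is the 2-byte field at offset 16, right after e_ident.
    # Read it as two separate bytes so host byte order doesn't matter.
    set -- $(od -An -tu1 -j16 -N2 "$1")
    case $(( ${1:-0} + 256 * ${2:-0} )) in
        1) echo "ET_REL" ;;    # relocatable object file (.o)
        2) echo "ET_EXEC" ;;   # non-PIE executable
        3) echo "ET_DYN" ;;    # shared object or PIE executable
        *) echo "other" ;;
    esac
}
```

With the files from the question, elf_type hello.o and elf_type hello should report different types. readelf -h shows the same field on its "Type:" line.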
as doesn't have a special-case mode that outputs an executable instead of an object file. There are other assemblers (at least one, FASM) that do have a special mode to output an ELF executable directly.
Given the ELF object file that as produces, you could:
- link it into a simple static executable like you're doing
- link it into a PIE executable
- link it into a dynamic executable, possibly even one that links some .so shared libraries; those could have static constructors (init functions) that run before your _start. (For example, glibc's libc.so does this, which is why it happens to work to call libc functions from _start on Linux without manually calling glibc init functions, if you dynamically link.)
The .o needs to be linked because no absolute address has been chosen for it to be loaded at, so the linker still has to fill in things like the absolute address used as an immediate in mov $message, %rsi.
(If you'd used lea message(%rip), %rsi, the code would be position-independent, but in general the distance between the .text and .rodata sections wouldn't be known until link time. You put your string right in .text, though, so that RIP-relative offset would get resolved at assemble time, giving you a stand-alone block of code + data, if you hadn't chosen the least efficient way to get an address into a register. But the most efficient way, mov $message, %esi, would also need an absolute (32-bit) address.)
as doesn't know what you want to do, and GNU Binutils was primarily written for use by compiler back-ends, so there was no point making as more complicated to be able to write an ELF-type EXEC file directly, since that's what ld is for. This is the Unix philosophy of making small separate tools that do one thing well.
If you want to assemble and link with one command, make a shell script, or use a compiler front-end:
gcc -nostdlib -static -no-pie start.s -o static_executable
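For the shell-script option, a minimal sketch might look like the following. The function name asmlink and the AS/LD override variables are choices made for this example, not something Binutils provides; by default it runs the same as and ld commands as in the question.

```shell
# Hypothetical helper: assemble NAME.s and link it into a static executable NAME.
# AS and LD can be overridden in the environment; they default to as and ld.
asmlink() {
    name=${1%.s}                          # accept "hello" or "hello.s"
    "${AS:-as}" "$name.s" -o "$name.o" &&
    "${LD:-ld}" "$name.o" -o "$name"      # link only if assembling succeeded
}
```

Usage would be asmlink hello.s followed by ./hello.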