Attempting to make my own loader, but cannot implement the data section

Time:02-27

I am trying to implement my own binary loader for learning purposes, but cannot figure out the data segment.

section .data
    helloworld db "hello world", 10

section .text
    global _start

test: ;just for testing
    ret

_start:
    call test

    mov rax, 1
    mov rbx, 1
    mov rcx, helloworld
    mov rdx, 11
    syscall
    
    mov rax, 60
    mov rdi, 0
    syscall

This is the assembly program I am trying to run. I assembled and linked it with nasm -f elf64 test.s -o test.o && ld test.o -o test.bin

My loader looks like this:

int main(int argc, char** argv) {
  char* bin = argv[1];
  struct ElfLib lib = read_elf(bin); //just reading the elf library into the default structures (Elf64_Ehdr, Elf64_Phdr, etc...)
  
  unsigned char* exec = mmap(NULL, DEFAULT_MEM_SIZ, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); //allocating the virtual memory
  memset(exec, 0, DEFAULT_MEM_SIZ);

  for (int i = 0; i < lib.elf_header.e_phnum; i++) {
        Elf64_Phdr phdr = lib.program_headers[i];

        fseek(lib.execfile, phdr.p_offset, SEEK_SET);
        switch (phdr.p_type) {
            case PT_LOAD: {
                //load the memory at the file offset into the virtual address of exec
                fread(exec + phdr.p_vaddr, sizeof(unsigned char), phdr.p_memsz, lib.execfile);
                break;
            }
        }

        int flags = PROT_NONE;

        #define HASFLAG(flag) if (phdr.p_flags & flag) flags|=flag

        HASFLAG(PROT_EXEC); //execute flag on
        HASFLAG(PROT_WRITE); //write flag on
        HASFLAG(PROT_READ); //read flag on
        
        mprotect(exec + phdr.p_vaddr, phdr.p_memsz, flags);
  }

  void (*ex)() = (void*)(exec + lib.elf_header.e_entry);
  ex(); //call the _start function in the virtual memory
}

But when I run it, nothing gets printed.

I ran it under GDB, and the program exits right after the exit syscall (mov rax, 60 and mov rdi, 0), so I know the system call part works. I think the issue is the address of helloworld in the hello world program. GDB shows it is still at address 0x402000, which is probably not where the data ends up in the mapped memory. Surprisingly, objdump puts the test function at 0x401000, but under GDB it sits at a completely different address, and it does get called. Does anyone have an idea how to go about implementing this?

I'm not sure how much this will help, but I'm running x86-64 Linux on an Intel CPU.

CodePudding user response:

nasm -f elf64 test.s -o test.o
ld test.o -o test.bin

I don't have NASM, but if I use the GNU assembler instead, the commands above produce a position-dependent executable.

This means that phdr.p_vaddr is not an offset relative to the variable exec. It is an absolute address that must not be changed.

Assuming the symbol helloworld is located at the start of the data segment, the instruction mov rcx, helloworld simply loads the value phdr.p_vaddr into the register rcx, not the value exec + phdr.p_vaddr.

However, because the address phdr.p_vaddr may already be in use by the loader process, you cannot simply load your code there!
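
If you want to try anyway, the kernel can be asked to place a mapping at exactly that address and to fail instead of overwriting whatever already lives there. A rough sketch, assuming Linux with MAP_FIXED_NOREPLACE (kernel 4.17 or later); map_segment_at is a hypothetical helper, and the fallback #define is an assumption for older libc headers:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000 /* Linux value, assumed when headers lack it */
#endif

/* Hypothetical helper: reserve zeroed, writable memory that covers
 * [vaddr, vaddr + memsz) at exactly that address.
 * Returns the page-aligned start of the mapping, or NULL if the range
 * is already occupied or the mapping fails. */
void *map_segment_at(uintptr_t vaddr, size_t memsz)
{
    assert(memsz > 0);
    uintptr_t page  = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = vaddr & ~(page - 1);                      /* round down */
    uintptr_t end   = (vaddr + memsz + page - 1) & ~(page - 1); /* round up */
    size_t    len   = end - start;

    void *p = mmap((void *)start, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if ((uintptr_t)p != start) { /* older kernels treat the address as a hint */
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

A loader would call map_segment_at(phdr.p_vaddr, phdr.p_memsz) for each PT_LOAD entry, copy the file contents in, and only then tighten the protection with mprotect. A NULL return means the address range was not available in the loader's own address space.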

If you want to load code into an already running program, the only possibility is so-called "position-independent code", which can be loaded at different addresses in memory.

By the way:

On 64-bit x86 Linux, the syscall instruction does not take its parameters in rbx, rcx and rdx, but in rdi, rsi and rdx (the system call number still goes in rax).
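
To make the register assignment concrete, here is the write system call issued from C with each register spelled out. This is only a sketch, assuming GCC or Clang inline assembly on x86-64 Linux; raw_write is a made-up name, and on other architectures the code falls back to the libc syscall() wrapper:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: issue write(2) directly, naming every register. */
long raw_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
#if defined(__x86_64__)
    long ret;
    __asm__ volatile("syscall"
                     : "=a"(ret)               /* rax: return value */
                     : "a"((long)SYS_write),   /* rax: system call number (1) */
                       "D"((long)fd),          /* rdi: first argument */
                       "S"(buf),               /* rsi: second argument */
                       "d"(len)                /* rdx: third argument */
                     : "rcx", "r11", "memory"); /* overwritten by syscall */
    return ret;
#else
    return syscall(SYS_write, fd, buf, len);   /* other architectures */
#endif
}
```

In the assembly program from the question, that corresponds to putting 1 in rdi and the address of helloworld in rsi, while the length stays in rdx.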
