Why can't my HELLO_WORLD string be loaded from section .data?

Time:11-08

I am writing a bootloader as a way to learn assembly. I have looked into using sections to organize and optimize my code, but one thing that doesn't work is calling my printf function. When my HELLO_WORLD string is inside section .data, the string isn't printed at all:

; Set Code to run at 0x7c00
org 0x7c00
; Put into real mode
bits 16 

; Variables without values
section .bss

; Our constant values
section .data
    HELLO_WORLD: db 'Hello World!', 0

; Where our code runs
section .text
    _start:
        mov si, HELLO_WORLD ; Moves address for string into si register
        call printf ; Calls printf function
        jmp $ ; Jump forever
        
    printf:
        lodsb ; Load the next character
        cmp al, 0 ; Compares al to 0
        je _printf_done ; If they are equal...
        call print_char ; Call Print Char
        jmp printf ; Jump to the loop
    _printf_done:
        ret ; Return
    
    print_char:
        mov ah, 0x0e ; tty mode
        int 0x10 ; Video interrupt
        ret ; Return

; Fills the rest of the data with 0
times 510-($-$$) db 0
; BIOS boot magic number
dw 0xaa55   

RESULT:

Booting into hard drive...

However, if I move the string out of section .data and put it at the end of section .text (after print_char), it works:

; Set Code to run at 0x7c00
org 0x7c00
; Put into real mode
bits 16 

; Variables without values
section .bss

; Our constant values
section .data

; Where our code runs
section .text
    _start:
        mov si, HELLO_WORLD ; Moves address for string into si register
        call printf ; Calls printf function
        jmp $ ; Jump forever
        
    printf:
        lodsb ;  Loads next character
        cmp al, 0 ; Compares al to 0
        je _printf_done ; If they are equal...
        call print_char ; Call Print Char
        jmp printf ; Jump to the loop
    _printf_done:
        ret ; Return
    
    print_char:
        mov ah, 0x0e ; tty mode
        int 0x10 ; Video interrupt
        ret ; Return

    HELLO_WORLD: db 'Hello World!', 0

; Fills the rest of the data with 0
times 510-($-$$) db 0
; BIOS boot magic number
dw 0xaa55   

RESULT:

Booting into hard drive...
Hello World!

Why is that?

Answer:

$ - $$ calculates the position within the current section, which is .text at that point, so you're padding .text out to 510 bytes plus the 2-byte signature. The .data section therefore ends up after the boot signature and is not part of the boot sector.

I noticed this by looking at the file size: 525 bytes. Using a hexdump to see what went where:

$ nasm -fbin bad.asm
$ hd bad               # equivalent to hexdump -C
00000000  be 00 7e e8 02 00 eb fe  ac 3c 00 74 05 e8 03 00  |..~......<.t....|
00000010  eb f6 c3 b4 0e cd 10 c3  00 00 00 00 00 00 00 00  |................|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200  48 65 6c 6c 6f 20 57 6f  72 6c 64 21 00           |Hello World!.|

We see that the Hello World! ASCII bytes started at offset 512 within the file, so it's not part of the first 512-byte sector which the firmware will load when booting in legacy BIOS mode.

Flat binaries don't have sections or ELF or PE program segments, and will get loaded with everything having read write exec permission (with the CPU in real mode so there is no paging or segment permissions). It's probably easiest to think in terms of creating a flat binary, and where you're placing things within those first 512 bytes, not in terms of .data and .text sections of an executable.

You can put your section .bss after the dw 0xaa55, because space immediately after where your MBR gets loaded (linear address 0x7C00) tends to be free to use. Putting it after the boot signature in your source makes your source match how NASM will lay out the flat binary. Note that it won't be zero-initialized for you like .bss space is under a mainstream OS.
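
A minimal sketch of that layout, in case it helps. The variable name counter is made up for illustration, and the DS setup is my addition because [counter] is addressed through DS:

```nasm
org 0x7c00
bits 16

section .text
_start:
    xor ax, ax
    mov ds, ax              ; [counter] is addressed through DS, so make DS = 0
    mov byte [counter], 0   ; zero it ourselves; nothing cleared this memory for us
    jmp $                   ; hang

; Pad .text to 510 bytes, then the boot signature
times 510-($-$$) db 0
dw 0xaa55

; Reserved (not stored) space. NASM places this after the
; initialized bytes, so it takes up no room in the file.
section .bss
counter: resb 1             ; hypothetical variable
```

Assembled with nasm -fbin, this should still produce a 512-byte file, since .bss only reserves addresses and emits nothing.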


If you really wanted to use section directives and have some .rodata or .data after your code but before the boot signature, you'd need to do something other than $-$$.

Like maybe put labels at the start/end of each section so you can write three lines: totalsize equ (text_end-text_start) + (data_end-data_start), then times (510-totalsize) db 0, then dw 0xaa55. But you'd have to do this in whichever section NASM would put last, otherwise you'd be pushing some sections out past the 512-byte boundary. Fortunately the file size provides an easy check for that.
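
A rough sketch of that label-based approach, reusing the question's code. The names text_start, text_end, data_start, data_end and totalsize are just illustrative. One wrinkle, going by NASM's documented default of dword-aligning sections in flat binaries: if .text isn't a multiple of 4 bytes, NASM leaves a gap before .data that the label arithmetic doesn't see, so this version pads .text with align 4 first. The DS setup is my addition.

```nasm
org 0x7c00
bits 16

section .text
text_start:
    xor ax, ax
    mov ds, ax              ; lodsb reads from DS:SI
    mov si, HELLO_WORLD
    call printf
    jmp $                   ; hang

printf:
    lodsb                   ; next character into AL
    cmp al, 0
    je _printf_done
    call print_char
    jmp printf
_printf_done:
    ret

print_char:
    mov ah, 0x0e            ; BIOS teletype output
    int 0x10
    ret

    align 4                 ; keep .text a multiple of 4 so no hidden gap precedes .data
text_end:

section .data
data_start:
HELLO_WORLD: db 'Hello World!', 0
data_end:

; .data is laid out last here, so the padding and signature go in it
totalsize equ (text_end-text_start) + (data_end-data_start)
times (510-totalsize) db 0
dw 0xaa55
```

The output should be exactly 512 bytes; anything larger means something got pushed past the signature.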

You can control what order NASM lays out sections in a flat binary. That's kind of a special case for NASM; it's acting as a linker as well as an assembler, filling in symbol offsets not just making relocation entries. Use the attributes start=x and follows=y on the section directive the first time it appears for a new section. (Thanks @ecm for pointing that out.) But the default is already to order .text first, which is what you need since execution starts at the first byte of the MBR.
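
For illustration, a sketch using follows= to state the order explicitly. The section name .rodata and the labels text_start, text_end and msg are my own choices here, and with only two sections this matches what NASM would do by default anyway. The padding is computed in the last section from the size of .text plus the current offset in .rodata:

```nasm
org 0x7c00
bits 16

section .text
text_start:
    xor ax, ax
    mov ds, ax              ; lodsb reads from DS:SI
    mov si, msg
.next:
    lodsb                   ; next character into AL
    cmp al, 0
    je .done
    mov ah, 0x0e            ; BIOS teletype output
    int 0x10
    jmp .next
.done:
    jmp $                   ; hang

    align 4                 ; sections are dword-aligned by default; avoid a hidden gap
text_end:

; follows= has to appear the first time the section is declared
section .rodata follows=.text
msg: db 'Hello World!', 0

; bytes used so far = size of .text + offset within this section
times 510 - (text_end - text_start) - ($ - $$) db 0
dw 0xaa55
```

As with the other approach, checking that the file is exactly 512 bytes is a quick sanity test.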


At first I assumed NASM would output the sections in order of first appearance into your flat binary, in which case the problem would be executing the db 'Hello World!', 0 as machine code.

Turns out that's not what NASM does; it puts the .text section first in the flat binary, even if section .data is first in the source.


BTW, your bootloader relies on some things that aren't guaranteed, and will fail on some BIOSes.
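
The full list isn't spelled out here, but two assumptions are visible in the code itself: lodsb reads from DS:SI, so org 0x7c00 only matches if DS is 0, and it only walks forward through the string if the direction flag is clear. The BIOS promises neither. A sketch of the working version with that setup added (the BH/BL values for int 0x10 are an extra precaution of mine, not something from the question):

```nasm
org 0x7c00
bits 16

_start:
    xor ax, ax
    mov ds, ax              ; org 0x7c00 assumes DS = 0
    cld                     ; DF = 0 so lodsb moves SI forward
    mov si, HELLO_WORLD
    call printf
    jmp $                   ; hang

printf:
    lodsb                   ; next character into AL
    cmp al, 0
    je _printf_done
    call print_char
    jmp printf
_printf_done:
    ret

print_char:
    mov ah, 0x0e            ; BIOS teletype output
    mov bx, 0x0007          ; BH = page 0, BL = light grey (used in graphics modes)
    int 0x10
    ret

HELLO_WORLD: db 'Hello World!', 0

times 510-($-$$) db 0
dw 0xaa55
```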

(Bochs is generally recommended for single-step debugging of bootloaders. Especially if you do anything with segmentation or switching to protected mode; GDB connected to Qemu doesn't know about segmentation the way Bochs does.)
