How does the far jump work on real hardware (bootloader creation)

I am trying to develop my own OS (for learning purposes). Everything is fine in QEMU; it works well.

nasm -f bin os.s -o os
qemu-system-x86_64 -drive file=os,format=raw

So I decided to try it on real hardware, and I ran into two issues. The first is that when I boot my PC from the USB key containing my OS (dd bs=4M if=os of=/dev/sdc), it starts by printing "Invalid partition table!". If I then press any key, it executes the code correctly: it starts booting my OS, prints "VOK" on the screen, and stops there.

Does anyone know where the "Invalid partition table!" message could come from?

Second question: at the end, I try to perform a far jump to leave 16-bit mode and enter 32-bit mode. As you can see, I currently stop the program just before it with an infinite loop (jmp $). When I move the loop to just after the far jump, at the init_32 label, it works fine in QEMU but not on my laptop (it reboots again and again, meaning the code crashed). I have no idea why this is happening; any help would be appreciated!

[org 0x7c00]

SECTOR_READ equ 0x08
; memory offset where our kernel is located
KERNEL_OFFSET equ 0x7e00

; this is the boot loader which will be present on the MBR
; sector 1, 512 bytes

mov ah, 0x0e ; tty mode

mov bp, 0x9000 ; this is an address far away from 0x7c00 so that we don't get overwritten
mov sp, bp ; if the stack is empty then sp points to bp

mov al, 'V'
int 0x10

; let's load our OS loader

; load 'dh' sectors from drive 'dl' into ES:BX
mov bx, KERNEL_OFFSET;

mov ah, 0x02 ; ah <- int 0x13 function. 0x02 = 'read'
mov al, SECTOR_READ   ; al <- number of sectors to read (0x01 .. 0x80)
mov cl, 0x02 ; cl <- sector (0x01 .. 0x11)
                ; 0x01 is our boot sector, 0x02 is the first 'available' sector
mov ch, 0x00 ; ch <- cylinder (0x0 .. 0x3FF, upper 2 bits in 'cl')
; dl <- drive number. Our caller sets it as a parameter and gets it from BIOS
; DONT TOUCH
; (0 = floppy, 1 = floppy2, 0x80 = hdd, 0x81 = hdd2)
mov dh, 0x00 ; dh <- head number (0x0 .. 0xF)

; [es:bx] <- pointer to buffer where the data will be stored
int 0x13      ; BIOS interrupt
jc error ; if error (stored in the carry bit)

cmp al, SECTOR_READ    ; BIOS also sets 'al' to the # of sectors read. Compare it.
jne error

mov ah, 0x0e ; tty mode
mov al, 'O'
int 0x10
mov al, 'K'
int 0x10
mov al, 0x0a ; newline char
int 0x10
mov al, 0x0d ; carriage return
int 0x10

jmp switch_64bits

error:
    mov ah, 0x0e ; tty mode
    mov al, 'K'
    int 0x10
    mov al, 'O'
    int 0x10
    mov al, 0x0a ; newline char
    int 0x10
    mov al, 0x0d ; carriage return
    int 0x10
    jmp $

gdt_start: ; don't remove the labels, they're needed to compute sizes and jumps
    ; the GDT starts with a null 8-byte
    dd 0x0 ; 4 byte
    dd 0x0 ; 4 byte

; GDT for code segment. base = 0x00000000, length = 0xfffff
; for flags, refer to os-dev.pdf document, page 36
gdt_code: 
    dw 0xffff    ; segment length, bits 0-15
    dw 0x0       ; segment base, bits 0-15
    db 0x0       ; segment base, bits 16-23
    db 10011010b ; flags (8 bits)
    db 11001111b ; flags (4 bits) + segment length, bits 16-19
    db 0x0       ; segment base, bits 24-31

; GDT for data segment. base and length identical to code segment
; some flags changed, again, refer to os-dev.pdf
gdt_data:
    dw 0xffff
    dw 0x0
    db 0x0
    db 10010010b
    db 11001111b
    db 0x0

gdt_end:

; GDT descriptor
gdt_descriptor:
    dw gdt_end - gdt_start - 1 ; size (16 bit), always one less of its true size
    dd gdt_start ; address (32 bit)

; define some constants for later use
CODE_SEG equ gdt_code - gdt_start
DATA_SEG equ gdt_data - gdt_start

[bits 16]
switch_64bits: ; load 32 bit first, then 64 bits

    ; clear all interrupts
    cli

    ; load our Global Descriptor Table
    lgdt [gdt_descriptor]

    ; switch to protected mode
    ; set PE (Protection Enable) bit in CR0
    ; CR0 is a Control Register 0
    mov eax, cr0
    or al, 0x1
    mov cr0, eax

    ; far jump to 32 bit instructions
    ; so we can be sure processor has done
    ; all other operations before switch
    ; at this moment we can say bye to 16-bit Real Mode

    jmp $

    jmp CODE_SEG:init_32

[bits 32]
init_32:

; padding and magic number
times 510 - ($-$$) db 0
; Magic number
dw 0xaa55

times 256 dw 0x0202 ; sector 2 = 512 bytes 0x7e00
times 256 dw 0x0303 ; sector 3 = 512 bytes 0x8000
times 256 dw 0x0404 ; sector 4 = 512 bytes 0x8200
times 256 dw 0x0505 ; sector 5 = 512 bytes 0x8400
times 256 dw 0x0606 ; sector 6 = 512 bytes 0x8600
times 256 dw 0x0707 ; sector 7 = 512 bytes 0x8800
times 256 dw 0x0808 ; sector 8 = 512 bytes 0x8a00
times 256 dw 0x0909 ; sector 9 = 512 bytes 0x8c00

CodePudding user response：

Does anyone know where the "Invalid partition table!" message could come from?

When USB flash was first introduced, there was no standard to determine how booting from USB flash is supposed to work (unlike CD, where a proper standard was created). With no standard of its own, firmware was left with 2 choices: emulate a floppy disk (even though it's large like a hard disk) or emulate a hard disk (even though it's removable like a floppy disk). Some motherboards put a setting in the BIOS configuration to control what happens. Eventually most/all firmware decided to try to auto-detect what it should do using glorious unspecified non-standard shenanigans.

Then UEFI got introduced and added the possibility of GPT partitions, and the possibility that there's a FAT partition with a UEFI boot loader (and that legacy BIOS shouldn't be used).

The end result is something like:

if it seems like there's a BPB and no partition table; then it should emulate a floppy disk with legacy BIOS
if it seems like there's a partition table in the MBR; check if it's a protective MBR and if there's a GPT partition table too.
if there's a partition table (of either type - MBR or GPT), use it to check for a UEFI partition with a FAT file system that contains a UEFI boot loader; and if there is then do UEFI boot (no emulation, no legacy stuff)
if there's a partition table (of either type) but no UEFI boot loader; then emulate hard disk for legacy BIOS
if none of the above worked, make an unspecified default choice (either emulate floppy with legacy BIOS, or emulate hard drive with legacy BIOS, or assume that the device can't be booted from at all)

Of course "glorious unspecified non-standard shenanigans" can include anything, and can include displaying a warning message when nothing makes sense (e.g. when there's nothing that seems like it might be a BPB and also nothing that seems like it might be a partition table).

Note that "glorious unspecified non-standard shenanigans" can also include unexpected dodgy nonsense; like if there's MBR partitions, assuming code in the MBR merely chain-loads a boot loader from the active partition and firmware booting the 1st sector of the active partition directly without executing any code in the MBR; and like "correcting" (corrupting) fields in an assumed BPB that doesn't look like a BPB at all.

Also be warned that (in an abundance of pure arrogance) Microsoft decided that it's fine if their OS writes a 32-bit "disk signature" at offset 0x01B8 in the MBR of disks that belong to other people's operating systems (that they have no right to touch in any way whatsoever), and that this can also corrupt an operating system's boot code, and (if an OS uses it) it can ruin TPM (where code used during boot, including the MBR, is "measured" by firmware before it's used, for security purposes, for things like remote attestation and establishing disk encryption keys).

To guard against all of the above, you'll want to pick a clearly defined case ("floppy emulation, legacy BIOS", "hard disk emulation, legacy BIOS" or "UEFI boot"); and do everything you can to steer the firmware's "glorious unspecified non-standard shenanigans" towards the correct choice (e.g. if you chose "hard disk emulation, legacy BIOS"; then fill the area where a BPB would be with zeros to reduce the chance of firmware thinking it might be a valid BPB and choosing "floppy emulation"); and then assume anything in the area where the BPB would be (from offset 0x000B to at least 0x003F inclusive) and the 4 bytes at offset 0x01B8 may be corrupted and not use these bytes.

Of course you'll also want the dw 0xaa55 in the last 2 bytes, and a jmp in the first few bytes; as these may be used (and might not be used) by firmware to decide if the disk is bootable.
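
As a rough illustration of the "hard disk emulation, legacy BIOS" case described above, here is a minimal sketch of a boot sector layout in NASM. It is not a guaranteed fix for any particular firmware. The label name, the partition type byte 0x7F, and the partition start and size (chosen to cover sectors 2 to 9 from the question) are assumptions made for the example.

```nasm
; Sketch of a boot sector laid out for "hard disk emulation, legacy BIOS".
; Assumptions (not from the original post): the label name, the partition
; type 0x7F, and the partition start/size values are placeholders.
[bits 16]
[org 0x7c00]

    jmp short start              ; a jmp in the first bytes (offset 0x0000)
    nop                          ; pad so the BPB area starts at offset 0x0003

    times 0x40 - ($ - $$) db 0   ; offsets 0x0003..0x003F zeroed and never relied on

start:                           ; real boot code starts at offset 0x0040
    cli
    hlt
    jmp start                    ; placeholder for the real boot code

    times 0x1B8 - ($ - $$) db 0  ; boot code must end before offset 0x01B8
    dd 0                         ; 0x01B8: 32-bit disk signature (may be overwritten)
    dw 0                         ; 0x01BC: two unused bytes

    ; 0x01BE: first partition table entry (16 bytes)
    db 0x80                      ; status: active
    db 0x00, 0x02, 0x00          ; CHS start: head 0, sector 2, cylinder 0
    db 0x7F                      ; partition type (placeholder value)
    db 0x00, 0x09, 0x00          ; CHS end: head 0, sector 9, cylinder 0
    dd 1                         ; LBA of the first sector in the partition
    dd 8                         ; number of sectors in the partition

    times 0x1FE - ($ - $$) db 0  ; the other three entries are left empty
    dw 0xaa55                    ; boot signature in the last 2 bytes
```

Assembled with nasm -f bin, this produces exactly 512 bytes. The point of the layout is that no instruction or data the boot code depends on lives in the BPB area or in the 4 bytes at offset 0x01B8, so it should not matter if firmware or another OS rewrites them.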

When I move the loop to just after the far jump, at the init_32 label, it works fine in QEMU but not on my laptop (it reboots again and again, meaning the code crashed).

Possible causes include:

Firmware corrupting your code (described above)
Windows corrupting your code (described above)
Firmware skipping MBR code and booting 1st sector of a partition that doesn't exist (mentioned above)
Your code making assumptions about the undefined value BIOS felt like leaving in ES, loading a sector at this undefined address (that depends on the value left in ES), then either corrupting itself (because the undefined address is where your code or stack is) or doing a far jump to random garbage at a defined address that isn't where the sector was loaded.
Your code making assumptions about the undefined value BIOS felt like leaving in SS and SP, loading a sector at a "possibly accidentally correct" address that happens to be where the stack is, then crashing (e.g. BIOS int 0x13 code returning to a corrupted address because the stack was trashed).
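
The last two causes can be ruled out by setting every segment register and the stack pointer before anything else runs. The following is a minimal sketch, assuming the code is assembled with org 0x7c00 as in the question; the labels set_cs and boot_drive and the stack address 0x7c00 are choices made for this example, not part of the original code. DS is included because data references such as lgdt [gdt_descriptor] are resolved through DS, so a non-zero DS left by the BIOS would make the CPU load a garbage GDT.

```nasm
; Sketch only: label names and the stack address are assumptions.
[bits 16]
[org 0x7c00]

start:
    jmp 0x0000:set_cs      ; far jump: force CS to 0 so offsets match org 0x7c00

set_cs:
    xor ax, ax
    mov ds, ax             ; DS = 0: data references such as lgdt [gdt_descriptor] use DS
    mov es, ax             ; ES = 0: int 0x13 (ah = 0x02) stores sectors at ES:BX
    cli                    ; no interrupts while SS and SP are inconsistent
    mov ss, ax             ; SS = 0
    mov sp, 0x7c00         ; stack grows down from just below the boot sector
    sti
    cld                    ; string instructions count upward
    mov [boot_drive], dl   ; save the drive number the BIOS passed in DL

    ; the rest of the boot code (int 0x13 read, GDT load, far jump) would go here
    jmp $

boot_drive: db 0

times 510 - ($ - $$) db 0
dw 0xaa55
```

With CS, DS, ES and SS all zero, every address the assembler computed from org 0x7c00 is also the address the CPU uses, which is what the GDT descriptor and the far jump to CODE_SEG:init_32 rely on.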

CodePudding user response：

On fixed media, you have to provide a Master Boot Record (MBR) in the boot sector as well.