I have been given an assignment to write a two-pass assembler for the 8086. I want to keep things simple and only assemble small programs for now. I found that the .COM format is very simple, but I cannot find the specifics of the file format.
I also read that execution always begins at 100h. Won't that be a problem if MS-DOS (actually DOSBox in my case) already has system programs loaded there? And do I need to provide some default stub code in the 0h-100h range?
I simply want to know how to write a .COM file that is runnable in DOSBox.
CodePudding user response:
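The .COM format has no structure: the file is a raw image of the program's code and data, with no header, and it is copied into memory byte for byte.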
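The program is loaded at offset 100h of some segment. Below that, at offsets 0h-0ffh, you'll find the PSP (program segment prefix) for your program. DOS fills it in, so you do not need to supply any stub code. The last usable word in the segment (usually at fffeh) is overwritten with 0000h and the stack pointer is pointed at it. This allows you to exit the program with a ret instruction: the ret pops 0000h into IP, and offset 0 of the PSP holds an int 20h instruction that terminates the program.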
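To illustrate the exit mechanism, here is a minimal sketch of what is probably the smallest useful .COM program. It assumes NASM syntax and the flat binary output format; the label name start is only for illustration.

```nasm
        org     100h            ; DOS loads the image at offset 100h

start:  ret                     ; pops the 0000h word DOS placed on the stack,
                                ; so execution continues at offset 0 of the
                                ; PSP, which holds an int 20h (terminate)
```

The resulting file should be a single byte, C3h, which makes it a convenient first test case for a home-grown assembler.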
Before your program starts, DOS sets all of CS, DS, ES, and SS to the segment of your program. Then the DOS kernel jumps to offset 0100h (i.e. the start of your program) to run it.
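Because DS already points at the segment that holds both the PSP and the program, the PSP can be read with plain offsets and no segment setup. The sketch below prints the command-line tail, assuming the usual PSP layout (length byte at 80h, text from 81h, terminated by a carriage return) and DOS function 09h, which prints a '$'-terminated string. It is an illustration, not part of the original answer.

```nasm
        org     100h

start:  mov     bl, [80h]               ; length of the command tail (PSP:80h)
        xor     bh, bh                  ; BX = length as a 16-bit value
        mov     byte [bx + 81h], '$'    ; overwrite the trailing CR with '$'
        mov     dx, 81h                 ; DS:DX -> command tail text
        mov     ah, 09h                 ; DOS: print '$'-terminated string
        int     21h
        ret                             ; exit through the PSP
```

Running it as, say, program hello should echo the text typed after the program name, including the leading space; a tail that itself contains a '$' would be cut short at that character.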
That's really all there is to it.
Note that because the DOS kernel picks a segment for your program, it won't collide with any already loaded programs or the DOS kernel. It'll just work as expected.
As for writing COM programs, I recommend using an assembler like NASM with the flat binary output format (i.e. raw output with no headers or other file format). The general template is this:
        org     100h            ; tell NASM that the binary is loaded at 100h

start:  ...                     ; the program starts here; this must
                                ; be the first thing in the file

        ; place any variables or constants after the code
Then assemble with
nasm -f bin -o program.com program.asm
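As a filled-in version of the template, here is a small sketch that prints a message and exits. The label msg and the message text are made up for illustration; the program relies on DOS function 09h (print a '$'-terminated string at DS:DX) and on DS already being equal to CS.

```nasm
        org     100h                    ; the image is loaded at offset 100h

start:  mov     ah, 09h                 ; DOS: print '$'-terminated string
        mov     dx, msg                 ; DS:DX -> message (DS = CS already)
        int     21h
        ret                             ; exit through the PSP

; data goes after the code so that execution never runs into it
msg:    db      'Hello, world!', 13, 10, '$'
```

The code occupies 8 bytes, so msg lands at offset 108h. If you are writing your own assembler, comparing its output for this source against NASM's byte for byte is a simple way to check it.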
For more information, this resource might be helpful to you.