What are the details of .com file format?

Time:12-23

I have been given an assignment to write a two-pass assembler for the 8086. I want to keep things simple and only assemble small programs for now. I found that the .COM format is very simple, but I cannot find the specifics of the file format.

I also read that execution always begins at 100h. Won't that be a problem if MS-DOS (actually DOSBox in my case) already has system programs loaded there? And do I need to provide some default stub code in the 0h-100h region?

I simply want to know how to write a .COM file that runs on DOSBox.

Answer:

The .COM format has no structure: there is no header and no metadata, and the file is just the raw bytes of your program.

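DOS loads the file at offset 100h of a segment it chooses. The 256 bytes below that (offsets 0h-0FFh) hold the Program Segment Prefix (PSP), which DOS builds for your program, so you do not supply anything for that region yourself. The last usable word in the segment (usually at FFFEh) is set to 0000h and the stack pointer is pointed at it. This allows you to exit the program with a ret instruction: ret pops that 0000h and jumps to offset 0, where the PSP holds an int 20h (terminate program) instruction.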
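As a rough illustration of the exit mechanism described above, here is a minimal sketch of a .COM program that does nothing except terminate. The commented-out alternative uses DOS function 4Ch, which is not mentioned above but is the usual explicit way to exit; treat it as an optional variant rather than something the format requires.

```nasm
        org     100h

start:  ; ... program body would go here ...

        ; Option 1: near return. This pops the 0000h word that DOS
        ; placed on the stack, so execution continues at offset 0 of
        ; the segment, where the PSP holds an int 20h instruction.
        ret

        ; Option 2 (use instead of the ret above): terminate through
        ; DOS function 4Ch, which also passes an exit code in AL.
        ; mov     ax, 4C00h       ; AH = 4Ch, AL = exit code 0
        ; int     21h
```

The ret variant only works if the stack is balanced at that point and CS still holds the original segment, which is normally the case in a small program.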

Before the program starts, DOS sets all of CS, DS, ES, and SS to the segment of your program. Then, the DOS kernel jumps to offset 0100h (i.e. the start of your program) to run it.
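Because DS already points at the program's segment on entry, both your own data and the PSP can be addressed with plain offsets and no segment setup. The sketch below is an illustrative example, not part of the original answer. It echoes the command-line tail, relying on the standard PSP layout: the byte at offset 80h holds the tail's length and the text starts at 81h.

```nasm
        org     100h

start:  mov     bl, [80h]           ; BL = length of the command tail
        xor     bh, bh              ; BX = length as a 16-bit value
        mov     byte [bx+81h], '$'  ; overwrite the trailing CR with the
                                    ; '$' terminator that function 09h expects
        mov     dx, 81h             ; DS:DX -> start of the command tail
        mov     ah, 09h             ; DOS function 09h: print '$'-terminated string
        int     21h
        ret                         ; exit via the int 20h at offset 0 of the PSP
```

Running it as, say, "program hello" should print the text that follows the program name (usually including the leading space). A '$' inside the arguments would cut the output short, which is a limitation of function 09h.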

That's really all there is to it.
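To make the "no structure" point concrete for someone writing their own assembler, here is a sketch of a complete .COM file written out byte by byte with db directives. The encodings are the standard 8086 opcodes for the instructions in the comments. The address 0108h is worked out by hand as the load offset 100h plus the 8 bytes of code that precede the string.

```nasm
; Hand-encoded 8086 machine code; each db line is one instruction.
; No org directive is needed because the only address used is
; written out by hand.
        db      0B4h, 09h            ; mov ah, 09h   (DOS print-string function)
        db      0BAh, 08h, 01h       ; mov dx, 0108h (100h + 8 bytes of code)
        db      0CDh, 21h            ; int 21h
        db      0C3h                 ; ret           (to the int 20h in the PSP)
        db      "Hello", 13, 10, "$" ; the string, at file offset 8
```

Assembled as a flat binary, this produces a 16-byte file, and those 16 bytes are the entire program. An assembler targeting .COM only has to emit the instruction and data bytes in order and add 100h to every label's file offset.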

Note that because the DOS kernel picks a segment for your program, it won't collide with any already loaded programs or the DOS kernel. It'll just work as expected.

As for writing COM programs, I recommend using an assembler like NASM with output format “binary” (i.e. no output format). The general template is this:

        org     100h            ; Tell NASM that the binary is loaded to 100h

start:  ...                     ; the program starts here.  This must
                                ; be the first thing in the file.

        ; place any variables or constants after the code

Then assemble with

nasm -f bin -o program.com program.asm
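As a filled-in instance of the template, a minimal "Hello, world" might look like the sketch below. The label name msg and the message text are arbitrary choices for illustration.

```nasm
        org     100h                ; the binary is loaded at offset 100h

start:  mov     ah, 09h             ; DOS function 09h: print '$'-terminated string
        mov     dx, msg             ; DS:DX -> string (DS was already set by DOS)
        int     21h
        ret                         ; exit via the int 20h at offset 0 of the PSP

        ; data goes after the code so that execution never runs into it
msg:    db      "Hello, world!", 13, 10, "$"
```

Because of the org directive, NASM resolves msg to 100h plus its offset in the file, which is the address the string will have once DOS has loaded the program.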

For more information, this resource might be helpful to you.
