What should a BMP header look like?

I am trying to write code that reads a BMP file, so I am trying to parse the BMP header(s). I am using a test image from a code library, but its structure doesn't seem to match the description of the BMP format on Wikipedia (Wikipedia BMP structure). The mismatch is mainly in the field of the Wikipedia table where the offset should be stored ("The offset, i.e. starting address, of the byte where the bitmap image data (pixel array) can be found."). The start of the image looks like this:

00000000: 424d ea88 0000 0000 0000 3604 0000 2800  BM........6...(.
00000010: 0000 e300 0000 9500 0000 0100 0800 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0001  ................
00000030: 0000 0000 0000 2c2a 2c00 4c8f 6900 252e  ......,*,.L.i.%.
00000040: 9900 385b 4700 4c54 9700 98c9 a000 7a8c  ..8[G.LT......z.

So it seems clear that the image starts at 36h and not at 436h, as it should according to Wikipedia. It looks to me as if the valid information is 1 byte long, not 4 bytes. I tried to find another source of information about the header, but I found only the same information as in the article mentioned above.

I thought the image I had stored might be wrong, so I opened it in Gimp and saved it as a new file. The header has the same structure, though; only the start of the image has moved.

00000000: 424d 2e89 0000 0000 0000 7a04 0000 6c00  BM........z...l.
00000010: 0000 e300 0000 9500 0000 0100 0800 0000  ................
00000020: 0000 b484 0000 130b 0000 130b 0000 0001  ................
00000030: 0000 0001 0000 4247 5273 0000 0000 0000  ......BGRs......
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000060: 0000 0000 0000 0000 0000 0200 0000 0000  ................
00000070: 0000 0000 0000 0000 0000 2c2a 2c00 4c8f  ..........,*,.L.
00000080: 6900 252e 9900 385b 4700 4c54 9700 98c9  i.%...8[G.LT....

So Gimp not only reads the header without any issue, it even creates a header of the same type. But according to the documentation, the binary part should be stored at 436h, or 47Ah respectively, since little endian should be used. I am clearly missing something, but I can't see what: different sources (fastgraph.com BMP header, docs.fileformat.com BMP structure) show the same offsets, yet reality differs for some reason. Can you help me move forward?

PS: I would paste the original image here, but it looks like images are automatically converted to PNG format.

CodePudding user response：

tl;dr: In your image there's a color table. When there's a color table, it comes after the DIB header, and the pixel data is after that. Since the color table runs from 0x36 to 0x435, the pixel data itself is indeed at 0x436.

The last time I used it, the Wikipedia article on the BMP format was correct and more thorough and better presented than most other sources.

A BMP file has a file header and then a bitmap info header (sometimes called the DIB header). The pixel data might start immediately after the info header, but there may be additional color information and/or padding. The pixel data is after that.

Based on your first hex dump, you are correct that the file header says the pixel data starts at 0x0436. You don't show enough of the file to see whether the pixel data actually starts at 0x0436, but we can inspect the DIB header to see if it's consistent.
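As a minimal sketch of that reading, the 14 bytes of the file header can be unpacked as little-endian fields. The bytes are copied from the first hex dump in the question; the variable names are only illustrative.

```python
import struct

# First 14 bytes of the first hex dump in the question (the BMP file header).
file_header = bytes.fromhex("424d ea88 0000 0000 0000 3604 0000")

# "<" selects little-endian: 2-byte signature, 4-byte file size,
# two 2-byte reserved fields, 4-byte offset to the pixel array.
signature, file_size, reserved1, reserved2, pixel_offset = struct.unpack(
    "<2sIHHI", file_header
)

print(signature)          # b'BM'
print(hex(file_size))     # 0x88ea
print(hex(pixel_offset))  # 0x436
```

The four bytes `36 04 00 00` are read least significant byte first, which is why the offset is 0x0436 rather than 0x36.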

The DIB header is more complex because there are many versions. Starting at offset 0x0E, we can read that the size of the DIB header is 0x28 (or 40 decimal). That tells us it's a vanilla BITMAPINFOHEADER.

The bitmap size is 227x149 pixels. (Watch out, because odd widths can be tricky to get just right.)

The pixel format is 8 bits per pixel and BI_RGB is the compression value. So there will be a color table (palette) after the headers and before the pixel data. In fact, the header explicitly says there will be 256 color table entries.
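The header fields mentioned above can be decoded from the 40 bytes at offsets 0x0E to 0x35 of the first dump. This is a sketch that assumes the standard BITMAPINFOHEADER layout; the Python variable names are my own.

```python
import struct

# Bytes 0x0E..0x35 of the first hex dump: the 40-byte BITMAPINFOHEADER.
dib_header = bytes.fromhex(
    "28000000 "   # header size
    "e3000000 "   # width
    "95000000 "   # height
    "0100 "       # color planes
    "0800 "       # bits per pixel
    "00000000 "   # compression (0 = BI_RGB)
    "00000000 "   # image size (may be 0 for BI_RGB)
    "00000000 "   # horizontal resolution
    "00000000 "   # vertical resolution
    "00010000 "   # colors used
    "00000000"    # important colors
)

(header_size, width, height, planes, bits_per_pixel, compression,
 image_size, x_ppm, y_ppm, colors_used, colors_important) = struct.unpack(
    "<IiiHHIIiiII", dib_header
)

print(header_size, width, height)    # 40 227 149
print(bits_per_pixel, compression)   # 8 0
print(colors_used)                   # 256
```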

So the color table starts at 0x36. Each entry in the color table is four bytes (RGBQUAD), so the table length is 256 colors * 4 bytes/color = 1024 bytes. Thus the table runs from 0x36 to 0x435.
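The same arithmetic as a short sketch, which also decodes the first three palette entries from the dump. It assumes each entry is stored as blue, green, red, reserved (the RGBQUAD layout); the constant names are only illustrative.

```python
import struct

FILE_HEADER_SIZE = 14    # BMP file header
DIB_HEADER_SIZE = 0x28   # BITMAPINFOHEADER, as read from the dump
PALETTE_ENTRIES = 256    # "colors used" field of the DIB header
ENTRY_SIZE = 4           # one RGBQUAD: blue, green, red, reserved

palette_start = FILE_HEADER_SIZE + DIB_HEADER_SIZE
palette_end = palette_start + PALETTE_ENTRIES * ENTRY_SIZE  # first byte after the table

print(hex(palette_start))    # 0x36
print(hex(palette_end - 1))  # 0x435, last byte of the table
print(hex(palette_end))      # 0x436

# The 12 bytes at 0x36..0x41 of the first dump: three palette entries.
first_entries = bytes.fromhex("2c2a2c00 4c8f6900 252e9900")
colors = [
    (red, green, blue)
    for blue, green, red, _reserved in struct.iter_unpack("<4B", first_entries)
]
print(colors)
```

So the bytes `2c 2a 2c 00` at 0x36 are palette entry 0, not the first pixel of the image.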

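The pixel data can come anywhere after the color table; the offset field in the file header says exactly where. Here 0x0436 is the first byte after the color table, so that's the first possible address available. It's also exactly where the file header said the pixel data would be. (Note that 0x0436 itself is not a multiple of 4 counted from the start of the file. The 4-byte rule in BMP applies to the length of each pixel row, which is padded to a multiple of 4 bytes.)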
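One further consistency check is possible with the numbers from the first dump. This sketch assumes uncompressed rows padded to a multiple of 4 bytes and no extra data after the pixel array; all values come from the headers shown in the question.

```python
width, height, bits_per_pixel = 227, 149, 8   # from the DIB header
pixel_offset = 0x436                          # from the file header
file_size = 0x88EA                            # from the file header

# Each row is padded up to a multiple of 4 bytes.
row_stride = ((width * bits_per_pixel + 31) // 32) * 4
pixel_bytes = row_stride * height

print(row_stride)                       # 228
print(hex(pixel_offset + pixel_bytes))  # 0x88ea
```

The pixel array ends exactly at the file size stored in the header, which supports 0x0436 as the real start of the pixel data. The odd width of 227 is where the padding matters: each row occupies 228 bytes, not 227.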

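(When you saved the image with Gimp, Gimp chose a different version of the bitmap info header that's 0x6C bytes long, the size of a BITMAPV4HEADER. That is 0x44 bytes more than the 0x28-byte BITMAPINFOHEADER, so it's not surprising that the pixel data starts 0x44 bytes later, at 0x047A instead of 0x0436. Your second dump still shows 8 bits per pixel and 256 color table entries.)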

CodePudding user response：

Thanks for all replies!

After reading all the responses, it looks like I made a mistake: I thought the start of the bitmap palette was the start of the image, because the color at the start looked like the color of the first pixel of the image. All replies were useful, as they helped me understand the topic better. The comment from Martin Rosenau helped me understand my mistake, and the article at http://libertybasicuniversity.com/lbnews/nl100/format.htm was also helpful.