What does it mean for a pixel to be "knocked off" in a CNN?

Time:12-07

I'm reading a book in which one section introduces how a kernel works in a CNN: https://freecontent.manning.com/deep-learning-for-image-like-data/.

Sliding a kernel over an image and requiring that the whole kernel is at each position completely within the image, yields to an activation map with reduced dimensions. For example, if you’ve a 3 x 3 kernel on all sides, one pixel is knocked off in the resulting activation map; in case of a 5 x 5 kernel, even two pixels.

What does it mean here for one or two pixels to be knocked off?

CodePudding user response:

They mean that, without extra padding, a 3x3 kernel will "lose" one pixel per side in the output. So if your input image is NxN, the output will be (N-2)x(N-2). Likewise, a 5x5 kernel loses two pixels per side, giving (N-4)x(N-4).
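The size arithmetic above can be sketched in a few lines. This is only an illustration, assuming a square image, a square odd-sized kernel, stride 1 and no padding; the function names `valid_output_size` and `lost_per_side` are made up for this example.

```python
def valid_output_size(n: int, k: int) -> int:
    """Side length of the output for an n x n input and a k x k kernel (no padding, stride 1)."""
    if k > n:
        raise ValueError("kernel must not be larger than the image")
    return n - k + 1


def lost_per_side(k: int) -> int:
    """Pixels knocked off each side by an odd-sized k x k kernel."""
    return (k - 1) // 2


print(valid_output_size(5, 3), lost_per_side(3))  # 3 1
print(valid_output_size(5, 5), lost_per_side(5))  # 1 2
```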

For example, with N=5 you can see that when the kernel "fits" into the lower-right corner, its center (X) is still one pixel away from the image border along both the horizontal and vertical axes, so the border pixels never get an output value.

a a a a a           . . . . .
a a a a a           . b b b .
a a x x x    ===>   . b b b .
a a x X x           . b b B . 
a a x x x           . . . . .

 5 x 5                3 x 3
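The diagram can be reproduced by listing every position where the kernel's center may sit. This is a rough sketch, assuming 0-based (row, col) indices and an odd kernel size; `valid_centers` is a hypothetical helper, not something from the book.

```python
def valid_centers(n: int, k: int) -> list[tuple[int, int]]:
    """Centers (row, col) at which a k x k kernel lies entirely inside an n x n image."""
    half = k // 2  # how far the kernel reaches from its center
    return [(r, c) for r in range(half, n - half) for c in range(half, n - half)]


centers = valid_centers(5, 3)
print(len(centers))   # 9 positions, i.e. a 3 x 3 activation map
print(centers[0])     # (1, 1): first valid center, one pixel in from the top-left corner
print(centers[-1])    # (3, 3): the X in the diagram, one pixel in from the corner at (4, 4)
```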

To avoid this, various padding strategies are used, e.g. "surrounding your picture" with 0s so that the size is preserved:

0 0 0 0 0 0 0            . . . . . . .
0 a a a a a 0            . b b b b b .
0 a a a a a 0            . b b b b b .
0 a a a a a 0     ===>   . b b b b b .
0 a a a x x x            . b b b b b .
0 a a a x X x            . b b b b B .
0 0 0 0 x x x            . . . . . . .

 5 x 5   pad                5 x 5
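To check the effect of zero padding, here is a small pure-Python sketch. It assumes stride 1 and a plain sum-of-products at each position; `zero_pad` and `slide` are illustrative helpers written for this answer, not a library API.

```python
def zero_pad(image, p):
    """Surround a 2-D list with p rings of zeros."""
    width = len(image[0]) + 2 * p
    top_bottom = [[0] * width for _ in range(2 * p)]
    body = [[0] * p + list(row) + [0] * p for row in image]
    return top_bottom[:p] + body + top_bottom[p:]


def slide(image, kernel):
    """Apply the kernel at every position where it fits entirely inside the image."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


image = [[1] * 5 for _ in range(5)]   # 5 x 5 image of ones
kernel = [[1] * 3 for _ in range(3)]  # 3 x 3 summing kernel

unpadded = slide(image, kernel)
padded = slide(zero_pad(image, 1), kernel)

print(len(unpadded), len(unpadded[0]))  # 3 3
print(len(padded), len(padded[0]))      # 5 5
print(padded[0][0], padded[2][2])       # 4 9
```

The corner output is 4 rather than 9 because five of the nine values under the kernel there are padding zeros.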