I am currently studying computer vision, and in particular depthwise convolution (not depthwise separable convolution!).

What I want to know is how the "depth_multiplier" argument works.

First of all, when I do not set "depth_multiplier" and only use DepthwiseConv2D(kernel_size = (3,3)), the kernel shape is (3x3x3) and the output shape is (32x32x3).

Here is the question. When I use "depth_multiplier", as in

DepthwiseConv2D(kernel_size = (3,3),depth_multiplier=4)

does the kernel shape become (3x3x3)x4, meaning 4 kernels of shape (3x3x3)?

And does the output shape become (32x32x3)x4, or does it become (32x32x12)?

This matters to me because after this layer I am going to use a normal convolution with 128 kernels and kernel_size = (3,3), and I want to know what the shape of its kernel will be. Will it be (3x3x3)x128, or some other shape?

The training code runs, but I want to understand how it actually works.

I would also like to know how depth_multiplier=# works in general.

Thank you in advance.

Answer:

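Yes: the kernel holds 4 filters per input channel, so its shape is (3, 3, 3, 4), and the output is a single tensor of shape (32, 32, 12), not four separate (32, 32, 3) tensors. A following Conv2D with 128 filters and kernel_size=(3,3) therefore sees 12 input channels, so its kernel has shape (3, 3, 12, 128).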

With depth_multiplier=1:

import tensorflow as tf

x = tf.keras.Input(shape=(32, 32, 3))
y = tf.keras.layers.DepthwiseConv2D(kernel_size=(3, 3), padding='same')(x)
tf.keras.Model(inputs=x, outputs=y).layers[-1].weights[0].shape

# outputs: TensorShape([3, 3, 3, 1])

With depth_multiplier=4:

# reuses x from the snippet above
y = tf.keras.layers.DepthwiseConv2D(kernel_size=(3, 3), depth_multiplier=4, padding='same')(x)
tf.keras.Model(inputs=x, outputs=y).layers[-1].weights[0].shape

# outputs: TensorShape([3, 3, 3, 4])
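
To see what depth_multiplier does with those weights, here is a rough NumPy sketch of a depthwise convolution. It is not the Keras implementation, only a naive loop version written for illustration. The function name depthwise_conv2d, the random input, and the restriction to stride 1, 'same' padding, odd kernel sizes and no bias are all assumptions of the sketch. The output channel ordering (input channel times depth_multiplier, plus filter index) follows the one documented for tf.nn.depthwise_conv2d.

```python
import numpy as np


def depthwise_conv2d(x, kernel):
    """Naive depthwise convolution (stride 1, 'same' padding, no bias).

    x:      input of shape (H, W, C)
    kernel: filters of shape (kh, kw, C, depth_multiplier)
    returns an array of shape (H, W, C * depth_multiplier)
    """
    h, w, c = x.shape
    kh, kw, kc, dm = kernel.shape
    assert kc == c, "kernel needs one group of filters per input channel"

    # Zero-pad so the output keeps the input's height and width
    # (this simple padding is only correct for odd kernel sizes).
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw), (0, 0)))

    out = np.zeros((h, w, c * dm), dtype=x.dtype)
    for ch in range(c):          # each input channel is handled on its own
        for m in range(dm):      # depth_multiplier filters per input channel
            k = kernel[:, :, ch, m]
            for i in range(h):
                for j in range(w):
                    patch = xp[i:i + kh, j:j + kw, ch]
                    # Channels are never mixed: only channel `ch` is read.
                    out[i, j, ch * dm + m] = np.sum(patch * k)
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32, 3))        # one 32x32 image with 3 channels
kernel = rng.standard_normal((3, 3, 3, 4))  # depth_multiplier = 4
y = depthwise_conv2d(x, kernel)
print(kernel.shape, y.shape)  # (3, 3, 3, 4) (32, 32, 12)
```

Each input channel gets its own 4 filters of size 3x3, and the 3 x 4 = 12 resulting feature maps are stacked along the channel axis. Output channels 0 to 3 depend only on input channel 0, channels 4 to 7 only on input channel 1, and so on.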