Is it possible to create a layer that performs only the Pointwise Convolution part of keras.layers.Separa-CodePudding

I created the following CNN model:

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 4, strides=4, input_shape=(12, 12, 1)),  # -> (3, 3, 16)
        tf.keras.layers.SeparableConv2D(1, 3, depth_multiplier=1),          # -> (1, 1, 1)
        tf.keras.layers.Conv2D(8, 1, activation='relu'),                    # -> (1, 1, 8)
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(2)
    ])
    model.summary()  # summary() prints the table itself and returns None

My intention was that the SeparableConv2D layer would learn a single 3x3 kernel, apply it separately to each of the 16 3x3 input feature maps, and produce 16 single numbers. Instead, it learned a 3x3x16 kernel and produced a single number. After reading an explanation of SeparableConv2D, I understood that it trains a different 3x3 kernel for each channel (which I could live with) but then merges the resulting 16 numbers into 1. My questions are:

1. Is there a way (using SeparableConv2D or any other layer) to avoid the last step of going from 1x1x16 to 1x1x1?
2. Is there a way to train a single 3x3 kernel that is applied to all 16 channels separately and produces a 1x1x16 output?

CodePudding user response：

Keras's built-in convolution layers learn a separate set of weights for every input channel, so they never share one kernel across channels. To force Keras to train only a single kernel, you can either add a dimension of size one to your data so that it mimics a single channel, or implement your own version of a convolution layer.
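
To illustrate the per-channel weights, here is a minimal sketch, assuming TensorFlow 2.x with its bundled Keras. It builds a standalone Conv2D layer for a 16-channel input (the shape from the question) and inspects the kernel it creates; the variable names are only for illustration.

```python
import tensorflow as tf

# One filter, 3x3 window, built for an input with 16 channels.
conv = tf.keras.layers.Conv2D(1, 3)
conv.build((None, 3, 3, 16))

kernel, bias = conv.get_weights()
print(kernel.shape)  # (kernel_h, kernel_w, in_channels, filters)
print(kernel.size)   # number of trainable kernel weights
```

The kernel carries a separate 3x3 slice for each of the 16 input channels, which is why a single shared 3x3 kernel needs one of the workarounds described here.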

To answer your questions specifically:

1. I am not aware of any way to change the behavior of SeparableConv2D to achieve what you want. However, you can always build a custom layer that omits just the merging step.
2. Yes. As I said, you can permute and reshape your data with tf.keras.layers.Permute((3, 1, 2)) and tf.keras.layers.Reshape((16, 3, 3, 1)), and then use a higher-dimensional convolution, tf.keras.layers.Conv3D(1, (1, 3, 3)). Keras will initialize one 1x3x3 kernel that operates on the single channel of each of your 16 images. The output should then have shape (16, 1, 1, 1), not counting the batch dimension, which should be what you want to achieve.
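
The Permute / Reshape / Conv3D idea can be sketched as follows. This is a minimal sketch assuming TensorFlow 2.x with its bundled Keras; the final Reshape back to (1, 1, 16) and the variable names are my own additions, and the shapes in the comments exclude the batch dimension.

```python
import tensorflow as tf

# The single shared kernel lives in this layer.
shared_conv = tf.keras.layers.Conv3D(1, (1, 3, 3))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(12, 12, 1)),
    tf.keras.layers.Conv2D(16, 4, strides=4),  # -> (3, 3, 16)
    tf.keras.layers.Permute((3, 1, 2)),        # -> (16, 3, 3)
    tf.keras.layers.Reshape((16, 3, 3, 1)),    # -> (16, 3, 3, 1)
    shared_conv,                               # -> (16, 1, 1, 1)
    tf.keras.layers.Reshape((1, 1, 16)),       # -> (1, 1, 16)
])

print(model.output_shape)
print(shared_conv.get_weights()[0].shape)  # one 1x3x3 kernel, 1 channel in, 1 out
```

Because the 16 images are stacked along the depth axis and the kernel has depth 1, the same nine weights are applied to every image.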

CodePudding user response：

1. You can use a DepthwiseConv2D layer. It is the first part of the SeparableConv2D layer: it convolves each channel separately, each with its own kernel.

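2. Yes. You can rearrange your data into a single channel so that the 16 3x3 images sit next to each other, then apply a convolution with one 3x3 kernel and a stride of 3, and finally reshape the result back to 16 channels. That way every channel is convolved with the same kernel. The stride of 3 ensures that no window spans two channels. Note that the channel axis has to be moved in front of the width axis before reshaping, otherwise values from different channels are interleaved and each window mixes channels. Also, the target shape passed to Reshape does not include the batch dimension.

    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 4, strides=4, input_shape=(12, 12, 1)),  # -> (3, 3, 16)
        tf.keras.layers.Permute((1, 3, 2)),       # -> (3, 16, 3), keeps each channel's columns together
        tf.keras.layers.Reshape((3, 3 * 16, 1)),  # -> (3, 48, 1), 16 images side by side
        tf.keras.layers.Conv2D(1, 3, strides=3),  # -> (1, 16, 1)
        tf.keras.layers.Reshape((1, 1, 16)),      # -> (1, 1, 16)
        ....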
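
Whether each stride-3 window really sees a single channel depends on how the (3, 3, 16) tensor is laid out before it is flattened to (3, 48). The NumPy sketch below checks this on a toy array; NumPy only stands in for the Keras Permute and Reshape layers here, and the array names are hypothetical.

```python
import numpy as np

# A fake (3, 3, 16) feature map in which every value equals its channel index.
x = np.broadcast_to(np.arange(16), (3, 3, 16))

# Direct reshape: channels are the fastest-varying axis, so they interleave.
direct = x.reshape(3, 48)

# Move channels in front of the width axis first, like Permute((1, 3, 2)).
grouped = x.transpose(0, 2, 1).reshape(3, 48)

# Channels present in the first 3x3 window of a stride-3 convolution.
print(np.unique(direct[:, :3]))   # -> [0 1 2]
print(np.unique(grouped[:, :3]))  # -> [0]
```

Only the layout with the channel axis moved first gives windows that each contain exactly one channel.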

In this case, there is no difference between a DepthwiseConv2D and a Conv2D layer, since the input has just one channel.
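
For the multi-channel case from point 1, where each channel keeps its own kernel, a DepthwiseConv2D sketch could look like this. It assumes TensorFlow 2.x with its bundled Keras, and the variable names are only for illustration; shapes in the comments exclude the batch dimension.

```python
import tensorflow as tf

depthwise = tf.keras.layers.DepthwiseConv2D(3, depth_multiplier=1)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(12, 12, 1)),
    tf.keras.layers.Conv2D(16, 4, strides=4),  # -> (3, 3, 16)
    depthwise,                                 # -> (1, 1, 16), no merging step
])

print(model.output_shape)
print(depthwise.get_weights()[0].shape)  # one 3x3 kernel per input channel
```

Unlike SeparableConv2D, there is no pointwise 1x1 convolution afterwards, so the 16 results are not merged into one number.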

As an alternative to the reshape approach, you could use the functional API: create one convolution layer, slice out each channel, apply that same layer to each channel on its own, and then concatenate the results back together. I think this should work:

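    from tensorflow.keras import Input, Model
    from tensorflow.keras.layers import Concatenate, Conv2D

    inputs = Input(shape=(12, 12, 1))
    x = inputs
    x = Conv2D(16, 4, strides=4)(x)    # -> (3, 3, 16)
    conv_layer = Conv2D(1, 3)          # one 3x3 kernel, reused for every channel
    channels = []
    for i in range(16):
        channel = x[:, :, :, i:i + 1]  # slice keeps the channel axis -> (3, 3, 1)
        channels.append(conv_layer(channel))
    x = Concatenate(axis=-1)(channels)  # -> (1, 1, 16)
    ...
    model = Model(inputs=inputs, outputs=x)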