Is the sigmoid function only applicable after a Dense() layer?

I am building a network in Keras that is similar to SE-Net (https://github.com/titu1994/keras-squeeze-excite-network/blob/master/se.py), but with some differences.

Suppose I want to build a layer sequence like this:

import keras

inputs = keras.Input((None, None, 3))
x1 = keras.layers.Conv2D(filters=32, kernel_size=(3, 3))(inputs)
# keepdims=True (available in recent Keras versions) keeps the pooled
# output 4D, shape (batch, 1, 1, 32), so the 1x1 Conv2D layers can follow
x_gp = keras.layers.GlobalAveragePooling2D(keepdims=True)(x1)
x2 = keras.layers.Conv2D(filters=32, kernel_size=(1, 1))(x_gp)
x3 = keras.layers.Conv2D(filters=8, kernel_size=(1, 1))(x2)
x2_ = keras.layers.Conv2D(filters=32, kernel_size=(1, 1))(x3)
# Apply sigmoid as a layer on the conv output
x_se = keras.layers.Activation("sigmoid")(x2_)

I want to know whether applying sigmoid to get x_se like this is valid. Please tell me if I am doing something wrong.

CodePudding user response：

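You can certainly use sigmoid as an activation for CNN layers too; it is not restricted to Dense layers. The reasons sigmoid is usually not used with CNN layers are:

1. The sigmoid function is monotonic, but its derivative is not: it peaks at 0.25 at x = 0 and shrinks toward 0 as |x| grows, so gradients can vanish and training can get stuck.

2. Sigmoid's output range is (0, 1), so its outputs are never zero-centered.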
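
To make the two points above concrete, here is a minimal sketch in plain Python (no Keras needed). The helper names sigmoid and sigmoid_grad are made up for illustration; the derivative formula sigmoid(x) * (1 - sigmoid(x)) is the standard one.

```python
import math

def sigmoid(x):
    """Logistic sigmoid for a single float."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of sigmoid: s * (1 - s), where s = sigmoid(x)."""
    s = sigmoid(x)
    return s * (1.0 - s)

# Print the output and the gradient at a few sample inputs
for x in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(f"x={x:6.1f}  sigmoid={sigmoid(x):.5f}  grad={sigmoid_grad(x):.5f}")
```

The gradient is largest at x = 0 (exactly 0.25) and is already close to zero at |x| = 10, which is the regime where stacked sigmoid layers tend to stall.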

If you are experimenting with sigmoid in CNN layers, I would suggest using it in only a few layers. You could also give swish a try.
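
As a rough illustration of why sigmoid is not tied to Dense layers, the sketch below uses NumPy instead of Keras to stay dependency-light. The array shapes and the names features and gate_logits are assumptions standing in for a conv feature map and for the output of the 1x1 conv stack in the question; swish is written out as x * sigmoid(x).

```python
import numpy as np

def sigmoid(x):
    """Elementwise logistic sigmoid; works on arrays of any shape."""
    return 1.0 / (1.0 + np.exp(-x))

def swish(x):
    """Swish (also called SiLU): x * sigmoid(x)."""
    return x * sigmoid(x)

rng = np.random.default_rng(0)

# Hypothetical conv feature map: (batch, height, width, channels)
features = rng.normal(size=(2, 5, 5, 32))
# Hypothetical stand-in for the 1x1 conv output after global pooling
gate_logits = rng.normal(size=(2, 1, 1, 32))

gate = sigmoid(gate_logits)    # per-channel weights in (0, 1)
rescaled = features * gate     # broadcasts over height and width
activated = swish(features)    # swish keeps the input shape as well

print(gate.shape, rescaled.shape, activated.shape)
```

Because sigmoid is applied elementwise, it does not care whether its input came from a Dense layer or a convolution. Used as a gate in the last layer of a squeeze-and-excite style branch, its (0, 1) range is what you want; swish is more of an option for the hidden layers.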