In the neural network below, the second layer is non-trainable. When calculating the gradient for the first layer, however, will the second layer still participate?
In short, when a layer is set to non-trainable, does it affect the gradient descent of the other layers?
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_2 (Dense) (None, 256) 200960
my_pca_2 (my_pca) (None, 10) 2570
=================================================================
Total params: 203,530
Trainable params: 200,960
Non-trainable params: 2,570
_________________________________________________________________
CodePudding user response:
The second layer in the network is set to non-trainable, which only means that the weights in that layer will not be updated during training.
However, the second layer still participates when calculating the gradient for the first layer. During backpropagation the error signal flows backwards from the loss through the second layer's weights before it reaches the first layer, so the second layer's (frozen) weights appear in the gradient computed for the first layer. In other words, the non-trainable status of a layer only stops its own weight updates; it does not remove the layer from the gradient computation of the layers before it.
This is the essence of backpropagation via the chain rule: the gradient of the loss with respect to an earlier layer is obtained by propagating gradients through every later layer. There is no way a layer could fail to affect the gradients of its predecessor layers.
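You can see this in a minimal NumPy sketch (not the actual Keras internals, just the chain rule written out by hand): a two-layer network where the second layer's weights `W2` are treated as frozen. The gradient for the first layer's weights `W1` still contains `W2`, so changing the frozen weights changes the first layer's gradient.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network: h = relu(W1 @ x), y = W2 @ h, MSE loss.
# W2 plays the role of the frozen (non-trainable) layer.
x = rng.normal(size=(4, 1))
W1 = rng.normal(size=(3, 4))       # trainable layer
W2 = rng.normal(size=(2, 3))       # frozen layer: never updated
target = rng.normal(size=(2, 1))

def grad_W1(W1, W2, x, target):
    """Gradient of the MSE loss w.r.t. W1, by hand via the chain rule."""
    h_pre = W1 @ x
    h = np.maximum(h_pre, 0.0)     # ReLU
    y = W2 @ h
    dL_dy = 2.0 * (y - target)     # d(MSE)/dy
    # The frozen layer's weights propagate the error signal backwards:
    dL_dh = W2.T @ dL_dy
    dL_dhpre = dL_dh * (h_pre > 0) # ReLU derivative
    return dL_dhpre @ x.T

g_a = grad_W1(W1, W2, x, target)
g_b = grad_W1(W1, 10.0 * W2, x, target)  # same input, rescaled frozen layer

# Different frozen weights yield a different gradient for layer 1,
# which shows the non-trainable layer participates in backprop.
print(np.allclose(g_a, g_b))
```

Setting `trainable=False` in Keras only excludes `W2` from the optimizer's update step; the backward pass through the layer is unchanged.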