While testing the BatchNormalization (BN) layer in TensorFlow 2.0, I set up two simple networks to compare. Here is the first one:
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(20, input_shape=(10,)))  # layer 1
model.add(tf.keras.layers.Activation('relu'))            # activation layer
model.add(tf.keras.layers.Dense(10))                     # layer 2
model.add(tf.keras.layers.Activation('relu'))            # activation layer
model.add(tf.keras.layers.Dense(1))                      # output layer
model.add(tf.keras.layers.Activation('softmax'))         # activation layer
model.summary()
The output of model.summary() is:
Model: "sequential"
_________________________________________________________________
The Output Layer (type) Shape Param #
=================================================================
Dense (dense) (20) None, 220
_________________________________________________________________
Activation (activation) (20) None, 0
_________________________________________________________________
Dense_1 (Dense) (None, 10) 210
_________________________________________________________________
Activation_1 (Activation) (None, 10) 0
_________________________________________________________________
Dense_2 (Dense) (None, 1) 11
_________________________________________________________________
Activation_2 (Activation) (None, 1) 0
=================================================================
Total params: 441
Trainable params: 441
Non - trainable params: 0
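For reference, the Dense parameter counts can be checked by hand: each Dense layer has inputs * units weights plus units biases:

layer 1: 10 * 20 + 20 = 220
layer 2: 20 * 10 + 10 = 210
output:  10 *  1 +  1 =  11
total:                   441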
The summary after adding a BN layer is:

# after adding a BN layer
model2 = tf.keras.Sequential()
model2.add(tf.keras.layers.Dense(20, input_shape=(10,)))  # layer 1
model2.add(tf.keras.layers.BatchNormalization())          # BN layer
model2.add(tf.keras.layers.Activation('relu'))            # activation layer
model2.add(tf.keras.layers.Dense(10))                     # layer 2
model2.add(tf.keras.layers.Activation('relu'))            # activation layer
model2.add(tf.keras.layers.Dense(1))                      # output layer
model2.add(tf.keras.layers.Activation('softmax'))         # activation layer
model2.summary()
Model: "sequential_2
"_________________________________________________________________
The Output Layer (type) Shape Param #
=================================================================
Dense_6 (Dense) (20) None, 220
_________________________________________________________________
Batch_normalization_1 (Batch (None, 20) 80
_________________________________________________________________
Activation_6 (Activation) (20) None, 0
_________________________________________________________________
Dense_7 (Dense) (None, 10) 210
_________________________________________________________________
Activation_7 (Activation) (None, 10) 0
_________________________________________________________________
Dense_8 (Dense) (None, 1) 11
_________________________________________________________________
Activation_8 (Activation) (None, 1) 0
=================================================================
Total params: 521
Trainable params: 481
Non - trainable params: 40
My question: according to the BN transform z_out = gamma * z_norm + beta, the BN layer should only need 20 * 2 = 40 parameters (one gamma and one beta per feature). Why does the summary report 80 parameters for it?
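To see where the extra parameters live, the BN layer's variables can be listed. This is a minimal sketch against the model2 built above; the variable names shown in the comments (gamma, beta, moving_mean, moving_variance) are the ones the standard tf.keras BatchNormalization layer creates:

bn = model2.layers[1]  # the BatchNormalization layer
for w in bn.weights:
    print(w.name, w.shape, w.trainable)
# Expected: four vectors of 20 values each, 80 parameters in total, e.g.
# batch_normalization_1/gamma:0 (20,) True
# batch_normalization_1/beta:0 (20,) True
# batch_normalization_1/moving_mean:0 (20,) False
# batch_normalization_1/moving_variance:0 (20,) False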