A small machine learning problem


A question for the experts: while testing the BatchNormalization (BN) layer in TensorFlow 2.0, I built two simple networks to compare. The code is below.
 
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(20, input_shape=(10,)))  # layer 1
model.add(tf.keras.layers.Activation('relu'))            # activation layer
model.add(tf.keras.layers.Dense(10))                     # layer 2
model.add(tf.keras.layers.Activation('relu'))            # activation layer
model.add(tf.keras.layers.Dense(1))                      # output layer
model.add(tf.keras.layers.Activation('softmax'))         # activation layer


The output of model.summary() is:
Model: "sequential"
_________________________________________________________________
The Output Layer (type) Shape Param #
=================================================================
Dense (dense) (20) None, 220
_________________________________________________________________
Activation (activation) (20) None, 0
_________________________________________________________________
Dense_1 (Dense) (None, 10) 210
_________________________________________________________________
Activation_1 (Activation) (None, 10) 0
_________________________________________________________________
Dense_2 (Dense) (None, 1) 11
_________________________________________________________________
Activation_2 (Activation) (None, 1) 0
=================================================================
Total params: 441
Trainable params: 441
Non - trainable params: 0
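
For reference, the Dense parameter counts above follow the usual weights-plus-biases rule (inputs * units + units); a quick sanity check in Python:

# Dense parameter counts: weights (inputs * units) plus biases (units)
print(10 * 20 + 20)  # dense   -> 220
print(20 * 10 + 10)  # dense_1 -> 210
print(10 * 1 + 1)    # dense_2 -> 11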

After adding a BN layer, the model and its summary are:

 
# after adding a BN layer
model2 = tf.keras.Sequential()
model2.add(tf.keras.layers.Dense(20, input_shape=(10,)))  # layer 1
model2.add(tf.keras.layers.BatchNormalization())
model2.add(tf.keras.layers.Activation('relu'))            # activation layer
model2.add(tf.keras.layers.Dense(10))                     # layer 2
model2.add(tf.keras.layers.Activation('relu'))            # activation layer
model2.add(tf.keras.layers.Dense(1))                      # output layer
model2.add(tf.keras.layers.Activation('softmax'))         # activation layer


Model: "sequential_2
"_________________________________________________________________
The Output Layer (type) Shape Param #
=================================================================
Dense_6 (Dense) (20) None, 220
_________________________________________________________________
Batch_normalization_1 (Batch (None, 20) 80
_________________________________________________________________
Activation_6 (Activation) (20) None, 0
_________________________________________________________________
Dense_7 (Dense) (None, 10) 210
_________________________________________________________________
Activation_7 (Activation) (None, 10) 0
_________________________________________________________________
Dense_8 (Dense) (None, 1) 11
_________________________________________________________________
Activation_8 (Activation) (None, 1) 0
=================================================================
Total params: 521
Trainable params: 481
Non - trainable params: 40


According to the BN transform, z_hat = gamma * z + beta, the BN layer should only train 20 * 2 = 40 parameters. Why does the summary report 80 parameters for it here?
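
For context, here is a minimal sketch (assuming TF 2.x; the variable names are Keras's standard ones) that lists what the BN layer actually stores:

# Inspect the variables held by the BN layer (index 1 in model2)
bn = model2.layers[1]
for v in bn.weights:
    print(v.name, v.shape, 'trainable' if v.trainable else 'non-trainable')
# Expected: gamma (20,) and beta (20,) are trainable, while
# moving_mean (20,) and moving_variance (20,) are non-trainable,
# which matches the 40 non-trainable parameters in the summary.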