How does TensorFlow model.weights work?

Time:06-07

I have defined a neural network model:

model = keras.models.Sequential([
    keras.layers.Flatten(input_shape = (15,)), # the input layer
    keras.layers.Dense(20, activation = 'relu'), #the hidden
    keras.layers.Dense(20, activation = 'relu'), #the hidden
    keras.layers.Dense(20, activation = 'relu'), #the hidden
    keras.layers.Dense(20, activation = 'relu'), #the hidden
    keras.layers.Dense(20, activation = 'relu'), #the hidden
    keras.layers.Dense(2) #The output layer
])

This is the summary of the model:

Model: "sequential_7"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten_7 (Flatten)         (None, 15)                0         
                                                                 
 dense_40 (Dense)            (None, 20)                320       
                                                                 
 dense_41 (Dense)            (None, 20)                420       
                                                                 
 dense_42 (Dense)            (None, 20)                420       
                                                                 
 dense_43 (Dense)            (None, 20)                420       
                                                                 
 dense_44 (Dense)            (None, 20)                420       
                                                                 
 dense_45 (Dense)            (None, 2)                 42        
                                                                 
=================================================================
Total params: 2,042
Trainable params: 2,042
Non-trainable params: 0

After I trained it, I expected to get weights of the form:

16×20, 21×20, …, 21×2, because of the constant column the neural network has.

But when I actually inspect the shapes of model.weights, I get matrices at the even indices and vectors at the odd indices:

import numpy as np

mats = [np.array(w) for w in model.weights[0::2]]
vecs = [np.array(w) for w in model.weights[1::2]]
print([w.shape for w in mats])
print([w.shape for w in vecs])


[(15, 20), (20, 20), (20, 20), (20, 20), (20, 20), (20, 2)]
[(20,), (20,), (20,), (20,), (20,), (2,)]

My question is: are the vectors here just the rows of the matrices that correspond to the constant parameter of each layer?
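For what it's worth, the parameter counts in the summary are consistent with a separate vector per layer rather than an extra matrix row: each layer's count equals kernel elements plus vector elements. A quick arithmetic check (the shapes are copied from the output above):

```python
# Param # column from the summary: 320, 420 (x4), 42.
# Each count = kernel elements (in_dim * out_dim) + vector elements (out_dim).
shapes = [((15, 20), (20,)), ((20, 20), (20,)), ((20, 2), (2,))]
for (m, n), (b,) in shapes:
    print(m * n + b)  # 320, 420, 42
```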

CodePudding user response:

The vectors here are the biases that are defined for each Dense layer. Try outputting the weights of the first Dense layer for example:

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape = (15,)), # the input layer
    tf.keras.layers.Dense(20, activation = 'relu'), #the hidden
    tf.keras.layers.Dense(20, activation = 'relu'), #the hidden
    tf.keras.layers.Dense(20, activation = 'relu'), #the hidden
    tf.keras.layers.Dense(20, activation = 'relu'), #the hidden
    tf.keras.layers.Dense(20, activation = 'relu'), #the hidden
    tf.keras.layers.Dense(2) #The output layer
])

print(model.layers[1].weights)

You should see the kernel weights and the biases. This corresponds to the description of the Dense layer:

Dense implements the operation: output = activation(dot(input, kernel) + bias)
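As a minimal NumPy-only sketch of that operation (the kernel, bias, and input here are made-up random arrays with your first layer's shapes, not values pulled from a trained model):

```python
import numpy as np

rng = np.random.default_rng(0)
kernel = rng.standard_normal((15, 20))  # weight matrix, shape (15, 20)
bias = rng.standard_normal(20)          # bias vector, shape (20,)
x = rng.standard_normal((1, 15))        # one input row

# Dense with relu: output = activation(dot(input, kernel) + bias)
out = np.maximum(x @ kernel + bias, 0.0)
print(out.shape)  # (1, 20)
```

The bias is added after the matrix product, which is why Keras stores it as a separate `(20,)` tensor instead of folding it into the kernel as an extra row.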
