Discrepancy in the number of trainable parameters between model.summary() and len(conv_model.trainable_weights)

Time:11-18

Consider this TensorFlow Python code that loads a pretrained model:

import tensorflow as tf
from tensorflow import keras  # needed: the code below uses the bare `keras` name

conv_model = keras.applications.vgg16.VGG16(
    weights='imagenet',
    include_top=False)

conv_model.trainable = False
print("Number of trainable weights after freezing: ", len(conv_model.trainable_weights))

conv_model.trainable = True
print("Number of trainable weights after defreezing: ", len(conv_model.trainable_weights))

which prints:

Number of trainable weights after freezing:  0
Number of trainable weights after defreezing:  26

However, if I do

conv_model.trainable=True
conv_model.summary()

I get:

Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0

and if I freeze the model I get 0 trainable parameters.

Why is there this discrepancy between model.summary() and the other method?

CodePudding user response:

The length of trainable_weights doesn't give the total number of parameters. You should use:

import numpy as np
from keras.utils.layer_utils import count_params

np.sum([count_params(p) for p in conv_model.trainable_weights])
# 14714688

instead of

len(conv_model.trainable_weights)

len() returns the number of weight tensors (one kernel and one bias per conv layer), not the number of scalar parameters. Each tensor can be inspected with:

for p in conv_model.trainable_weights:
    print(p.name, p.shape, np.cumprod(p.shape)[-1], count_params(p))

# Output: 26 weight tensors (13 conv layers, kernel + bias each)
# name                  shape            cumprod parameters

block1_conv1/kernel:0   (3, 3, 3, 64)    1728   1728
block1_conv1/bias:0     (64,)            64     64
block1_conv2/kernel:0   (3, 3, 64, 64)   36864  36864
...
block5_conv3/kernel:0   (3, 3, 512, 512) 2359296 2359296
block5_conv3/bias:0     (512,)           512     512
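
The same distinction can be shown on a tiny model without downloading the ImageNet weights. This is a minimal sketch; the model and layer names here are made up for illustration:

```python
import numpy as np
from tensorflow import keras

# A small stand-in model: one conv layer and one dense layer.
model = keras.Sequential([
    keras.Input(shape=(8, 8, 3)),
    keras.layers.Conv2D(4, (3, 3), name="conv"),
    keras.layers.Flatten(),
    keras.layers.Dense(2, name="dense"),
])

# len() counts weight tensors: conv kernel + bias, dense kernel + bias.
print(len(model.trainable_weights))  # 4

# Summing the element counts of those tensors gives the parameter total,
# which is the number model.summary() reports as "Trainable params".
total = int(np.sum([np.prod(w.shape) for w in model.trainable_weights]))
print(total)  # 402 = (3*3*3*4 + 4) + (144*2 + 2)
```

So `len(model.trainable_weights)` and the summed parameter count answer two different questions: how many weight tensors exist, and how many scalar parameters they contain.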