There are many models on GitHub already trained on different tasks (images, NLP, etc.). How can I import the weights of these models and build a custom model on top of them? Should I build a model from scratch and match the number and shape of each layer, or how should I proceed?
For example, suppose I trained the CNN model below; how can I transfer it and reuse it later with other custom layers (a different input shape, for example)?
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
# A classification head is needed so the output shape matches the labels
model.add(layers.Flatten())
model.add(layers.Dense(10))

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))
CodePudding user response:
You should be able to save a model like this: model.save('my_model.h5')
and load it like this: new_model = tf.keras.models.load_model('my_model.h5')
So, assuming you have a model (e.g. from GitHub) in the *.h5 format, just load it and then call new_model.summary()
to inspect its architecture. When doing transfer learning you usually want to freeze the first layers and train only the last few, which you can do with:
new_model.trainable = False
to freeze the whole model, and:
for layer in new_model.layers[-3:]:
    layer.trainable = True
to unfreeze the last three layers.
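Putting it together, here is a minimal sketch of building a custom model on top of a frozen pretrained one. The base model is constructed inline as a stand-in for one loaded from disk, and the layer sizes and the 10-class head are assumptions, not something prescribed by a particular GitHub model:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Stand-in for a pretrained model; in practice you would load it with
# base = tf.keras.models.load_model('my_model.h5')
base = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
])
base.trainable = False  # freeze the pretrained feature extractor

# Stack a fresh classification head on top of the frozen base
new_model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10),  # logits for 10 classes
])
new_model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
# Only the new head's weights are updated during training:
# new_model.fit(train_images, train_labels, epochs=5)
```

Because this base is fully convolutional, you could also handle a different input shape by defining a new tf.keras.Input and calling the base on it with the functional API; a base containing Dense layers has fixed-size weights and would not transfer to a new input shape as easily.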