How can I continue training from the last epoch?


I saved the training history like this:

from pickle import dump

# batch_size is dropped here: generators already yield batches, so
# Keras does not accept a batch_size argument alongside them
history = model.fit(train_generator, epochs=epochs, steps_per_epoch=train_steps,
                    verbose=1, callbacks=callbacks, validation_data=val_generator,
                    validation_steps=val_steps)

with open('history_epochs.pkl', 'wb') as f:
    dump(history.history, f)

Can I use the history file to continue training from the last epoch? If so, how?

CodePudding user response:

The history pickle stores only the metric values from training, not the model's weights or optimizer state, so it cannot resume training by itself. Instead, pickle the model itself, then reload it and keep training:

  1. Create your model
  2. Train your model
  3. Save your model as a pickle file

Code for the above steps:

import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import joblib

# Load Fashion-MNIST and preview one example per class
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

fig, axes = plt.subplots(2, 5, figsize=(15, 6))
for idx, axe in enumerate(axes.flatten()):
    axe.axis('off')
    idx_img = np.argwhere(y_train == idx)[0][0]
    axe.imshow(X_train[idx_img], cmap=plt.cm.binary)
    axe.set_title(class_names[y_train[idx_img]])

# Scale pixels to [0, 1] and add a channel dimension
X_train = X_train.astype('float32') / 255.0
X_train = tf.expand_dims(X_train, axis=-1)

X_test = X_test.astype('float32') / 255.0
X_test = tf.expand_dims(X_test, axis=-1)

# One-hot encode the labels for categorical cross-entropy
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

# Simple CNN classifier
model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(X_train.shape[1], X_train.shape[2], 1)))
model.add(tf.keras.layers.Conv2D(128, (3,3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Conv2D(64, (3,3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Conv2D(128, (3,3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(512, activation='relu'))
model.add(tf.keras.layers.Dropout(rate=.4))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dropout(rate=.4))
# softmax is the usual choice for mutually exclusive classes; sigmoid
# is kept here so the training output below matches this code
model.add(tf.keras.layers.Dense(10, activation='sigmoid'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()


# Train briefly, evaluate, then pickle the whole model (architecture,
# weights, and optimizer state) so training can be resumed later
model.fit(X_train, y_train, batch_size=256, epochs=3, verbose=1, validation_split=.2)
model.evaluate(X_test, y_test, verbose=1)

joblib.dump(model, 'model.pkl')

Output:

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 26, 26, 128)       1280      
                                                                 
 batch_normalization (BatchN  (None, 26, 26, 128)      512       
 ormalization)                                                   
                                                                 
 dropout (Dropout)           (None, 26, 26, 128)       0         
                                                                 
 conv2d_1 (Conv2D)           (None, 24, 24, 64)        73792     
                                                                 
 batch_normalization_1 (Batc  (None, 24, 24, 64)       256       
 hNormalization)                                                 
                                                                 
 dropout_1 (Dropout)         (None, 24, 24, 64)        0         
                                                                 
 conv2d_2 (Conv2D)           (None, 22, 22, 128)       73856     
                                                                 
 batch_normalization_2 (Batc  (None, 22, 22, 128)      512       
 hNormalization)                                                 
                                                                 
 dropout_2 (Dropout)         (None, 22, 22, 128)       0         
                                                                 
 flatten (Flatten)           (None, 61952)             0         
                                                                 
 dense (Dense)               (None, 512)               31719936  
                                                                 
 dropout_3 (Dropout)         (None, 512)               0         
                                                                 
 dense_1 (Dense)             (None, 128)               65664     
                                                                 
 dropout_4 (Dropout)         (None, 128)               0         
                                                                 
 dense_2 (Dense)             (None, 10)                1290      
                                                                 
=================================================================
Total params: 31,937,098
Trainable params: 31,936,458
Non-trainable params: 640
_________________________________________________________________
Epoch 1/3
188/188 [==============================] - 19s 81ms/step - loss: 0.8264 - accuracy: 0.7398 - val_loss: 3.4644 - val_accuracy: 0.1245
Epoch 2/3
188/188 [==============================] - 14s 75ms/step - loss: 0.4896 - accuracy: 0.8283 - val_loss: 1.2240 - val_accuracy: 0.5802
Epoch 3/3
188/188 [==============================] - 14s 77ms/step - loss: 0.4055 - accuracy: 0.8544 - val_loss: 0.3711 - val_accuracy: 0.8675
313/313 [==============================] - 2s 5ms/step - loss: 0.3850 - accuracy: 0.8591
[0.3849639296531677, 0.8590999841690063]

INFO:tensorflow:Assets written to: ram://****/assets
['model.pkl']
  1. Load your model
  2. Continue training

Code for the above steps:

# Reload the pickled model and continue training from its current weights
model = joblib.load("/content/model.pkl")
model.fit(X_train, y_train, batch_size=256, epochs=2, verbose=1, validation_split=.2)
model.evaluate(X_test, y_test, verbose=1)

Output:

Epoch 1/2
188/188 [==============================] - 17s 84ms/step - loss: 0.4414 - accuracy: 0.8496 - val_loss: 0.3449 - val_accuracy: 0.8697
Epoch 2/2
188/188 [==============================] - 15s 82ms/step - loss: 0.3704 - accuracy: 0.8708 - val_loss: 0.2884 - val_accuracy: 0.8965
313/313 [==============================] - 1s 5ms/step - loss: 0.3114 - accuracy: 0.8938
[0.31136029958724976, 0.8938000202178955]
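To also resume the epoch numbering where the first run stopped, the `history_epochs.pkl` file from the question can supply the starting epoch. A minimal sketch, assuming the history was pickled as in the question and the model was saved with joblib as above (`X_train`/`y_train` as defined earlier):

import pickle
import joblib

# The number of saved loss values equals the number of completed epochs
with open('history_epochs.pkl', 'rb') as f:
    old_history = pickle.load(f)
last_epoch = len(old_history['loss'])

model = joblib.load('model.pkl')

# `epochs` is the total target, so this trains two more epochs whose
# progress bars are labelled last_epoch+1 and last_epoch+2
model.fit(X_train, y_train, batch_size=256,
          initial_epoch=last_epoch, epochs=last_epoch + 2,
          verbose=1, validation_split=.2)

Note that initial_epoch only affects the epoch counter and schedule-aware callbacks; the weights and optimizer state come from the pickled model itself.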


CodePudding user response:

The steps below apply to any deep learning library; a minimal Keras sketch follows the list.

  1. Build the model.
  2. Train the model.
  3. Save the model (this should save the parameters/weights as well).
  4. Load the model from the saved file (any time, anywhere).
  5. Continue with more training.
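As a concrete instance of these steps in Keras: `checkpoint.h5` is a hypothetical filename, and `X_train`/`y_train` are reused from the first answer. Saving in the HDF5 format stores the architecture, weights, and optimizer state together:

import tensorflow as tf

# 1. Build the model
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# 2. Train the model
model.fit(X_train, y_train, epochs=3)

# 3. Save the model (architecture + weights + optimizer state)
model.save('checkpoint.h5')

# 4. Load the model from the saved file (any time, anywhere)
restored = tf.keras.models.load_model('checkpoint.h5')

# 5. Continue with more training; the optimizer resumes where it left off
restored.fit(X_train, y_train, epochs=2)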