Kears LeNet High Training & Validation accuracy but Low Testing accuracy-CodePudding

I am trying to train the mnist database using the LeNet Architecture.

I downloaded the mnist_png images from github (https://github.com/myleott/mnist_png) and it had over 50000 images. I am trying to build a LeNet model for the prediction of handwritten numbers using the LeNet Architecture which was written using keras

Code for generating images.

train_ds = tf.keras.utils.image_dataset_from_directory(
  'mnist_png/training/',
  validation_split = 0.2,
  subset = "training",
  seed = 123,
  image_size = (32, 32),
  batch_size = 100)

val_ds = tf.keras.utils.image_dataset_from_directory(
  'mnist_png/training/',
  validation_split = 0.2,
  subset = "validation",
  seed = 123,
  image_size = (32, 32),
  batch_size = 100)

test_ds = tf.keras.utils.image_dataset_from_directory(
  'mnist_png/testing/',
  seed = 123,
  image_size = (32, 32),
  batch_size = 1000)

Output


Found 40818 files belonging to 7 classes.
Using 32655 files for training.
Found 40818 files belonging to 7 classes.
Using 8163 files for validation.
Found 10000 files belonging to 10 classes.

Input shape = (32, 32, 3)

My model summary

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 28, 28, 6)         456       
                                                                 
 average_pooling2d (AverageP  (None, 14, 14, 6)        0         
 ooling2D)                                                       
                                                                 
 activation (Activation)     (None, 14, 14, 6)         0         
                                                                 
 conv2d_1 (Conv2D)           (None, 10, 10, 16)        2416      
                                                                 
 average_pooling2d_1 (Averag  (None, 5, 5, 16)         0         
 ePooling2D)                                                     
                                                                 
 activation_1 (Activation)   (None, 5, 5, 16)          0         
                                                                 
 conv2d_2 (Conv2D)           (None, 1, 1, 120)         48120     
                                                                 
 flatten (Flatten)           (None, 120)               0         
                                                                 
 dense (Dense)               (None, 84)                10164     
                                                                 
 dense_1 (Dense)             (None, 10)                850       
                                                                 
=================================================================
Total params: 62,006
Trainable params: 62,006
Non-trainable params: 0

Model compiled with this code

model.compile(optimizer='adam', loss=losses.sparse_categorical_crossentropy, metrics=['accuracy'])

I have trained it for 10 epochs and I get this output -

Epoch 1/10
327/327 [==============================] - 31s 79ms/step - loss: 0.9729 - accuracy: 0.6456 - val_loss: 0.3609 - val_accuracy: 0.8951
Epoch 2/10
327/327 [==============================] - 25s 77ms/step - loss: 0.3036 - accuracy: 0.9021 - val_loss: 0.2276 - val_accuracy: 0.9330
Epoch 3/10
327/327 [==============================] - 28s 85ms/step - loss: 0.2170 - accuracy: 0.9307 - val_loss: 0.1862 - val_accuracy: 0.9389
Epoch 4/10
327/327 [==============================] - 29s 89ms/step - loss: 0.1778 - accuracy: 0.9433 - val_loss: 0.1892 - val_accuracy: 0.9401
Epoch 5/10
327/327 [==============================] - 25s 76ms/step - loss: 0.1521 - accuracy: 0.9519 - val_loss: 0.1692 - val_accuracy: 0.9476
Epoch 6/10
327/327 [==============================] - 27s 83ms/step - loss: 0.1392 - accuracy: 0.9553 - val_loss: 0.1340 - val_accuracy: 0.9588
Epoch 7/10
327/327 [==============================] - 26s 79ms/step - loss: 0.1203 - accuracy: 0.9609 - val_loss: 0.1131 - val_accuracy: 0.9632
Epoch 8/10
327/327 [==============================] - 25s 76ms/step - loss: 0.1128 - accuracy: 0.9644 - val_loss: 0.1170 - val_accuracy: 0.9644
Epoch 9/10
327/327 [==============================] - 27s 81ms/step - loss: 0.1061 - accuracy: 0.9663 - val_loss: 0.1051 - val_accuracy: 0.9659
Epoch 10/10
327/327 [==============================] - 29s 89ms/step - loss: 0.0968 - accuracy: 0.9699 - val_loss: 0.0950 - val_accuracy: 0.9705

When I run model.evaluate(test), i get a high loss and and a low accuracy.

10/10 [==============================] - 4s 200ms/step - loss: 9.2694 - accuracy: 0.0656

Is there any reason for that?

CodePudding user response：

Nothing seems obviously wrong. In test_ds try setting shuffle=False. To get a clue try running model.evaluate on val_ds and see if it gives the correct result. Only other thing I can think of is that something is amiss with the test data. Take a look at a few of the images and see if their associated label is corect.