Why does my TensorFlow model correctly predict JPG and PNG images but incorrectly predict frames from a real-time video stream? Every frame from the real-time video stream is incorrectly classified as class 1.
Attempt: I saved a PNG image from the real-time video stream. When I tested that saved PNG on its own, the model classified it correctly, yet when a visually similar image arrives as a frame in the real-time stream it is classified incorrectly. The PNG images and the stream frames are visually identical (background, lighting conditions, camera angle, etc.).
Structure of my model:
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
rescaling_2 (Rescaling) (None, 180, 180, 3) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 180, 180, 16) 448
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 90, 90, 16) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 90, 90, 32) 4640
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 45, 45, 32) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 45, 45, 64) 18496
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 22, 22, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 30976) 0
_________________________________________________________________
dense_2 (Dense) (None, 128) 3965056
_________________________________________________________________
dense_3 (Dense) (None, 3) 387
=================================================================
Total params: 3,989,027
Trainable params: 3,989,027
Non-trainable params: 0
_________________________________________________________________
Found 1068 files belonging to 3 classes.
Real-time prediction code (updated after Keertika's help!):
import numpy as np
import tensorflow as tf
from tensorflow import keras

def testModel(imageName):
    img_height = 180
    img_width = 180
    # Load the image the same way the training pipeline does:
    # RGB, 180x180, bilinear resizing
    img = keras.preprocessing.image.load_img(
        imageName,
        target_size=(img_height, img_width),
        interpolation="bilinear",
        color_mode="rgb"
    )
    img_array = keras.preprocessing.image.img_to_array(img)
    img_array = tf.expand_dims(img_array, 0)  # Create a batch
    predictions = new_model.predict(img_array)
    score = predictions[0]
    classes = ['1', '2', '3']
    prediction = classes[np.argmax(score)]
    print(
        "This image {} most likely belongs to {} with a {:.2f} percent confidence."
        .format(imageName, prediction, 100 * np.max(score))
    )
    return prediction
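The reshaping step asked about below is only a batching operation, not a resize, so it should not change the pixel values. As a sanity check, the img_to_array + expand_dims step can be reproduced with plain NumPy (the zero array here is a stand-in for a decoded 180x180 frame):

```python
import numpy as np

# Stand-in for the decoded image: a 180x180 RGB float array,
# as img_to_array would produce
img_array = np.zeros((180, 180, 3), dtype=np.float32)

# Prepend a batch axis, equivalent to tf.expand_dims(img_array, 0)
batch = np.expand_dims(img_array, 0)

print(batch.shape)  # (1, 180, 180, 3): the shape new_model.predict expects
```

The pixel contents are untouched; only a leading batch dimension of size 1 is added.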
Training code:
# image_dataset_from_directory returns a tf.data.Dataset that yields batches of
# images from the class subdirectories, together with integer labels.
import tensorflow as tf

directory_test = "/content/test"
batch_size = 32
img_height = 180
img_width = 180

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    directory_test,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
Is the accuracy being affected by the reshaping in the real-time prediction code? I do not understand why frame predictions are incorrect while single JPG and PNG image predictions are correct. Thank you for any help!
CodePudding user response:
The reason the real-time predictions are incorrect is the preprocessing: the preprocessing at inference time must always match the preprocessing used during training. tf.keras.preprocessing.image.load_img reproduces the training pipeline, but it takes an image path to load the image. So save each frame (e.g. as "sample.png") and pass that path to tf.keras.preprocessing.image.load_img; this should solve the issue. Also use the "bilinear" resize method, because that was used for the training data.
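One concrete way such a mismatch can arise (an assumption about the asker's capture code, since it isn't shown): frames from cv2.VideoCapture are in BGR channel order, while load_img decodes to RGB, so feeding a raw frame to the model reverses the channels it was trained on. A NumPy-only sketch of the effect:

```python
import numpy as np

# A toy 1x1 "frame": pure red in RGB order is (255, 0, 0)
rgb = np.array([[[255, 0, 0]]], dtype=np.uint8)

# OpenCV stores the same pixel in BGR order
bgr = rgb[..., ::-1]

# Fed to the model unconverted, the frame no longer matches the training input
print(np.array_equal(rgb, bgr))             # False: channel order differs
print(np.array_equal(rgb, bgr[..., ::-1]))  # True once the channels are reversed back
```

Saving the frame to PNG and reloading it with load_img, as suggested above, sidesteps this because load_img always decodes to RGB; alternatively, converting in memory with cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) avoids the disk round trip.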