I'm working through the CNN course on Coursera, and as I try to solve the assignment notebook, this is the error thrown at me:
ValueError: `x` (images tensor) and `y` (labels) should have the same length. Found: x.shape = (27455, 28, 28, 1), y.shape = (7172, 28, 28, 1)
But aren't they the same length? (The dimensions, that is.) The following is the code block that is causing the issue:
# GRADED FUNCTION: train_val_generators
def train_val_generators(training_images, training_labels, validation_images, validation_labels):
    """
    Creates the training and validation data generators

    Args:
        training_images (array): parsed images from the train CSV file
        training_labels (array): parsed labels from the train CSV file
        validation_images (array): parsed images from the test CSV file
        validation_labels (array): parsed labels from the test CSV file

    Returns:
        train_generator, validation_generator - tuple containing the generators
    """
    ### START CODE HERE

    # In this section, you will have to add another dimension to the data
    # So, for example, if your array is (10000, 28, 28)
    # You will need to make it (10000, 28, 28, 1)
    # Hint: np.expand_dims
    training_images = np.expand_dims(training_images, axis=3)
    validation_images = np.expand_dims(validation_images, axis=3)

    print(training_images.shape)
    print(validation_images.shape)

    # Instantiate the ImageDataGenerator class
    # Don't forget to normalize pixel values
    # and set arguments to augment the images (if desired)
    train_datagen = ImageDataGenerator(
        # Your Code Here
        rescale=1./255,
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest'
    )

    # Pass in the appropriate arguments to the flow method
    train_generator = train_datagen.flow(x=training_images,
                                         y=validation_images,
                                         batch_size=32)

    # Instantiate the ImageDataGenerator class (don't forget to set the rescale argument)
    # Remember that validation data should not be augmented
    validation_datagen = ImageDataGenerator(
        rescale=1./255
    )

    # Pass in the appropriate arguments to the flow method
    validation_generator = validation_datagen.flow(x=training_images,
                                                   y=validation_images,
                                                   batch_size=32)

    ### END CODE HERE

    return train_generator, validation_generator
After running this cell, it works fine and adds an extra dimension to my images. The following code cell raises the above-mentioned error.
# Test your generators
train_generator, validation_generator = train_val_generators(training_images, training_labels, validation_images, validation_labels)
print(f"Images of training generator have shape: {train_generator.x.shape}")
print(f"Labels of training generator have shape: {train_generator.y.shape}")
print(f"Images of validation generator have shape: {validation_generator.x.shape}")
print(f"Labels of validation generator have shape: {validation_generator.y.shape}")
This is my entire error message:
(27455, 28, 28, 1)
(7172, 28, 28, 1)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-27-c93bf4854fbc> in <module>()
1 # Test your generators
----> 2 train_generator, validation_generator = train_val_generators(training_images, training_labels, validation_images, validation_labels)
3
4 print(f"Images of training generator have shape: {train_generator.x.shape}")
5 print(f"Labels of training generator have shape: {train_generator.y.shape}")
3 frames
/usr/local/lib/python3.7/dist-packages/keras_preprocessing/image/numpy_array_iterator.py in __init__(self, x, y, image_data_generator, batch_size, shuffle, sample_weight, seed, data_format, save_to_dir, save_prefix, save_format, subset, dtype)
87 'should have the same length. '
88 'Found: x.shape = %s, y.shape = %s' %
---> 89 (np.asarray(x).shape, np.asarray(y).shape))
90 if sample_weight is not None and len(x) != len(sample_weight):
91 raise ValueError('`x` (images tensor) and `sample_weight` '
ValueError: `x` (images tensor) and `y` (labels) should have the same length. Found: x.shape = (27455, 28, 28, 1), y.shape = (7172, 28, 28, 1)
I searched Stack Overflow for many similar problems, which talked about changing the dimensions, but I think my dimensions are correct, because changing them did absolutely nothing. Any insights on this? Please help. Thanks! :_)
CodePudding user response:
In train_datagen.flow and validation_datagen.flow you pass the wrong arrays. For the y parameter you pass validation_images in both calls, but you need to pass training_labels and validation_labels respectively; in addition, validation_datagen.flow should receive x=validation_images rather than x=training_images.
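Concretely, only the flow calls need to change (a minimal sketch of just those lines, using the variable names from your function; the full runnable example follows below):

train_generator = train_datagen.flow(x=training_images,
                                     y=training_labels,
                                     batch_size=32)

validation_generator = validation_datagen.flow(x=validation_images,
                                               y=validation_labels,
                                               batch_size=32)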
I have corrected the above mistakes and written out the full code, using random images and a simple CNN model, and fit it.
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import tensorflow as tf
import numpy as np

def train_val_generators(training_images, training_labels, validation_images, validation_labels):
    # Add a channels dimension: (N, 28, 28) -> (N, 28, 28, 1)
    training_images = np.expand_dims(training_images, axis=3)
    validation_images = np.expand_dims(validation_images, axis=3)

    print(training_images.shape)
    print(validation_images.shape)

    # Training generator: rescaling plus augmentation
    train_datagen = ImageDataGenerator(
        rescale=1./255,
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest'
    )

    train_generator = train_datagen.flow(x=training_images,
                                         y=training_labels,
                                         batch_size=32)

    # Validation generator: rescaling only, no augmentation
    validation_datagen = ImageDataGenerator(
        rescale=1./255
    )

    validation_generator = validation_datagen.flow(x=validation_images,
                                                   y=validation_labels,
                                                   batch_size=32)

    return train_generator, validation_generator

train_generator, validation_generator = train_val_generators(
    training_images=np.random.rand(27455, 28, 28),
    training_labels=np.random.randint(0, 2, 27455),
    validation_images=np.random.rand(7172, 28, 28),
    validation_labels=np.random.randint(0, 2, 7172),
)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.4),
    tf.keras.layers.Dense(2)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

model.fit(train_generator,
          epochs=2,
          validation_data=validation_generator)
Output:
(27455, 28, 28, 1)
(7172, 28, 28, 1)
Epoch 1/2
858/858 [==============================] - 25s 25ms/step - loss: 0.6933 - val_loss: 0.6931
Epoch 2/2
858/858 [==============================] - 18s 21ms/step - loss: 0.6932 - val_loss: 0.6930
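As a quick sanity check (a minimal sketch, assuming the sparse integer labels used in the random-data example above), you can also pull a single batch from each generator and confirm that the image and label batches now have matching lengths:

# Fetch one batch from each generator and inspect the shapes
x_batch, y_batch = next(train_generator)
print(x_batch.shape)  # (32, 28, 28, 1)
print(y_batch.shape)  # (32,)

x_val_batch, y_val_batch = next(validation_generator)
print(x_val_batch.shape)  # (32, 28, 28, 1)
print(y_val_batch.shape)  # (32,)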