I'm working on a TensorFlow model for identifying different butterflies. I'm using neural networks for this, and I'm reading images from folders; all the data gets split into a train dataset and a validation dataset, but I want to split it like this:
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
instead of:
train_ds = utils.image_dataset_from_directory(data_dir, validation_split=0.2, subset="training", seed=123, image_size=(img_height, img_width), batch_size=BATCH_SIZE)
val_ds = utils.image_dataset_from_directory(data_dir, validation_split=0.2, subset="validation", seed=123, image_size=(img_height, img_width), batch_size=BATCH_SIZE)
I have tried doing this, but it makes my model's accuracy very poor, so I don't think it's correct:
train_images = np.concatenate([x for x, y in train_ds], axis=0)
train_labels = np.concatenate([y for x, y in train_ds], axis=0)
test_images = np.concatenate([x for x, y in val_ds], axis=0)
test_labels = np.concatenate([y for x, y in val_ds], axis=0)
I have tried a lot of methods from stackoverflow, but they also don't work.
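One thing worth checking with the attempt above: image_dataset_from_directory shuffles by default, and the two separate comprehensions iterate the dataset twice (once for images, once for labels), so a reshuffle between passes can leave images and labels misaligned, which would explain the poor accuracy. A safer pattern collects both in a single pass. This is a minimal sketch of that pattern, with small synthetic numpy batches standing in for train_ds, since it's an illustration rather than the original pipeline:

```python
import numpy as np

def collect_arrays(dataset):
    """Collect (image_batch, label_batch) pairs in ONE pass so they stay aligned."""
    image_batches, label_batches = [], []
    for x, y in dataset:  # each item is one (image_batch, label_batch) pair
        image_batches.append(np.asarray(x))
        label_batches.append(np.asarray(y))
    # collapse the list of batches into single arrays along the sample axis
    return (np.concatenate(image_batches, axis=0),
            np.concatenate(label_batches, axis=0))

# stand-in for train_ds: three batches of 4 "images" (8x8 RGB) with labels
fake_ds = [(np.zeros((4, 8, 8, 3)) + i, np.full((4,), i)) for i in range(3)]
train_images, train_labels = collect_arrays(fake_ds)
print(train_images.shape)  # (12, 8, 8, 3)
print(train_labels.shape)  # (12,)
```

With a real tf.data dataset the loop body is unchanged; the `.numpy()` conversion happens implicitly via np.asarray.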
My model:
model = tf.keras.Sequential([
    # See this link for a better understanding of the input shape:
    # https://www.codespeedy.com/determine-input-shape-in-keras-tensorflow/
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(180, 180, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2), strides=2),
    layers.Flatten(),
    layers.Dropout(0.2),  # input_shape is only needed on the first layer, so it's dropped here
    layers.Dense(64, activation='relu'),
    layers.Dense(5, activation='softmax')  # there are 5 class names/folders, i.e. 5 kinds of butterflies
])
CodePudding user response:
for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break
This code shows that the dataset you get is just an iterable you can loop over,
so the following should work for generating the lists you wanted:
list_images = [step[0] for step in train_ds]
list_labels = [step[1] for step in train_ds]
But there is a catch: you would have to account for the batch size you set when generating the dataset from the folder. This will work, but the items in list_images are themselves batches of images, each with the length of your batch size.
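To get rid of that per-batch nesting, the list of batches can be collapsed into one flat array along the sample axis. A minimal sketch of just that flattening step, using small synthetic numpy arrays in place of the real step[0]/step[1] tensors (shapes chosen to match the 180x180 RGB images in the question, with an assumed batch size of 32):

```python
import numpy as np

# list_images as described above: one entry per batch, each of length batch_size
list_images = [np.ones((32, 180, 180, 3), dtype=np.float32) * i for i in range(3)]
list_labels = [np.full((32,), i) for i in range(3)]

# collapse the batch dimension into a single sample axis
train_images = np.concatenate(list_images, axis=0)
train_labels = np.concatenate(list_labels, axis=0)
print(train_images.shape)  # (96, 180, 180, 3)
print(train_labels.shape)  # (96,)
```

The same np.concatenate call works directly on the lists built from train_ds, since each tensor batch converts to a numpy array.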
CodePudding user response:
Fixed the problem:
for x, y in train_ds:
    train_images = x
    train_labels = y
train_images and train_labels don't have to be initialized beforehand! (Note that the assignment is repeated on every iteration, so after the loop they hold only the last batch the dataset produced.)