I'm working on a TensorFlow model for identifying different butterflies. I'm using neural networks for this, and I'm reading images from folders; all the data gets split into a train dataset and a validation dataset, but I want to split it like this:
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
instead of:
train_ds = utils.image_dataset_from_directory(data_dir, validation_split=0.2, subset="training", seed=123, image_size=(img_height, img_width), batch_size=BATCH_SIZE)
val_ds = utils.image_dataset_from_directory(data_dir, validation_split=0.2, subset="validation", seed=123, image_size=(img_height, img_width), batch_size=BATCH_SIZE)
I have tried doing this, but it makes my model's accuracy very poor, so I don't think it's correct:
train_images = np.concatenate([x for x, y in train_ds], axis=0)
train_labels = np.concatenate([y for x, y in train_ds], axis=0)
test_images = np.concatenate([x for x, y in val_ds], axis=0)
test_labels = np.concatenate([y for x, y in val_ds], axis=0)
I have tried a lot of methods from stackoverflow, but they also don't work.
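One thing worth checking with the attempt above: image_dataset_from_directory shuffles by default, and the two separate comprehensions iterate the dataset twice (once for images, once for labels), so a reshuffle between passes can leave images and labels misaligned, which would explain the poor accuracy. A safer pattern collects both in a single pass. This is a minimal sketch of that pattern, with small synthetic numpy batches standing in for train_ds, since it's an illustration rather than the original pipeline:

```python
import numpy as np

def collect_arrays(dataset):
    """Collect (image_batch, label_batch) pairs in ONE pass so they stay aligned."""
    image_batches, label_batches = [], []
    for x, y in dataset:  # each item is one (image_batch, label_batch) pair
        image_batches.append(np.asarray(x))
        label_batches.append(np.asarray(y))
    # collapse the list of batches into single arrays along the sample axis
    return (np.concatenate(image_batches, axis=0),
            np.concatenate(label_batches, axis=0))

# stand-in for train_ds: three batches of 4 "images" (8x8 RGB) with labels
fake_ds = [(np.zeros((4, 8, 8, 3)) + i, np.full((4,), i)) for i in range(3)]
train_images, train_labels = collect_arrays(fake_ds)
print(train_images.shape)  # (12, 8, 8, 3)
print(train_labels.shape)  # (12,)
```

With a real tf.data dataset the loop body is unchanged; the `.numpy()` conversion happens implicitly via np.asarray.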
My model:
model = tf.keras.Sequential([
    # See this link for a better understanding of the input shape:
    # https://www.codespeedy.com/determine-input-shape-in-keras-tensorflow/
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(180, 180, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2), strides=2),
    layers.Flatten(),
    layers.Dropout(0.2),  # input_shape is only needed on the first layer, so it's dropped here
    layers.Dense(64, activation='relu'),
    layers.Dense(5, activation='softmax')  # there are 5 class names/folders, i.e. 5 kinds of butterflies
])
CodePudding user response:
for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break
This code shows that the dataset you get is just an iterable you can loop over,
so the following should work for generating the lists you wanted:
list_images = [step[0] for step in train_ds]
list_labels = [step[1] for step in train_ds]
But there is a catch: you would have to account for the batch size you set when generating the dataset from the folder. This will work, but the items in list_images are themselves batches of images, each with the length of your batch size.
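To get rid of that per-batch nesting, the list of batches can be collapsed into one flat array along the sample axis. A minimal sketch of just that flattening step, using small synthetic numpy arrays in place of the real step[0]/step[1] tensors (shapes chosen to match the 180x180 RGB images in the question, with an assumed batch size of 32):

```python
import numpy as np

# list_images as described above: one entry per batch, each of length batch_size
list_images = [np.ones((32, 180, 180, 3), dtype=np.float32) * i for i in range(3)]
list_labels = [np.full((32,), i) for i in range(3)]

# collapse the batch dimension into a single sample axis
train_images = np.concatenate(list_images, axis=0)
train_labels = np.concatenate(list_labels, axis=0)
print(train_images.shape)  # (96, 180, 180, 3)
print(train_labels.shape)  # (96,)
```

The same np.concatenate call works directly on the lists built from train_ds, since each tensor batch converts to a numpy array.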
CodePudding user response:
Fixed the problem:
for x, y in train_ds:
    train_images = x
    train_labels = y
train_images and train_labels don't have to be initialized beforehand! (Note that the assignment is repeated on every iteration, so after the loop they hold only the last batch the dataset produced.)