I am currently learning to work with TensorFlow/Keras and having some trouble loading images as a dataset.
For context, I've downloaded the Pizza/Not Pizza dataset from Kaggle and I just want to build a naive binary classification model.
From the Keras documentation, it looks like I should be using the image_dataset_from_directory function, but there's a problem: it requires an image size as an argument, and resizing messes up the dataset. I've already noticed that the images in the DS are either 512 x 384 or 384 x 512, so all I want to do is load the thousand images, apply a transpose to the ones in one orientation, and finally turn everything into tensors.
So, my question is: how do I load the images from a directory without imposing a certain size/shape beforehand?
CodePudding user response:
You could rotate all the images whose first dimension is 384 (or the other way around, it makes no difference) beforehand.
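A quick sanity check on a dummy array (a minimal sketch using NumPy only, not the actual dataset) shows that a 90-degree rotation swaps the two spatial dimensions, so every image ends up with the same shape:

```python
import numpy as np

# Dummy "portrait" image: 384 x 512 with 3 color channels.
img = np.zeros((384, 512, 3), dtype=np.uint8)

# A 90-degree rotation swaps height and width, so a
# 384 x 512 image becomes 512 x 384.
rotated = np.rot90(img)
print(rotated.shape)  # (512, 384, 3)
```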
This script rotates your images and saves them all in a new folder:
import os

import imageio
from scipy import ndimage  # ndimage is part of SciPy

path = "images/"
outPath = "rotated_images/"
os.makedirs(outPath, exist_ok=True)

# iterate through the names of the contents of the folder
for image_path in os.listdir(path):
    # create the full input path and read the file
    input_path = os.path.join(path, image_path)
    image_to_rotate = imageio.imread(input_path)

    # rotate all images whose first dimension is 384
    if image_to_rotate.shape[0] == 384:
        rotated = ndimage.rotate(image_to_rotate, 90)
    else:
        rotated = image_to_rotate

    fullpath = os.path.join(outPath, image_path)
    imageio.imsave(fullpath, rotated)
After that you can call image_dataset_from_directory as you wanted on the outPath folder.
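The loading step might then look like this (a sketch assuming TensorFlow 2.x, and assuming rotated_images/ contains one subfolder per class, e.g. pizza/ and not_pizza/, since image_dataset_from_directory infers labels from subfolder names):

```python
import tensorflow as tf

# Every rotated image is now 512 x 384, so image_size no
# longer forces any resampling or distortion.
dataset = tf.keras.utils.image_dataset_from_directory(
    "rotated_images/",
    labels="inferred",      # subfolder names become class labels
    label_mode="binary",    # pizza vs. not pizza
    image_size=(512, 384),
    batch_size=32,
)
```

Since all images already share the target shape, image_size here only confirms the shape rather than rescaling anything.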
Something similar can be found here.