So I have to use TensorFlow 1.9 for system-specific reasons. I want to train a CNN on a custom dataset consisting of images. The folder structure looks like this:
./
- circles/
  - circle-0.jpg
  - circle-1.jpg
  - ...
- hexagons/
  - hexagon-0.jpg
  - hexagon-1.jpg
  - ...
- ...
So the example I have to work with uses MNIST and has these two particular lines of code:
mnist_dataset = tf.keras.datasets.mnist.load_data('mnist_data')
(x_train, y_train), (x_test, y_test) = mnist_dataset
In my work, I also have to use this data format, `(x_train, y_train), (x_test, y_test)`, which seems to be quite common. As far as I have been able to find out so far, the format of those datasets is `(image_data, label)`, with shapes like `((60000, 28, 28), (60000,))`, at least for the MNIST dataset. The `image_data` here is supposedly of dtype `uint8` (according to this post). I was also able to find out that a `tf.data.Dataset()` object can look like the tuples I need here: `(image_data, label)`.
So far so good. But a few questions arise from this which I haven't been able to figure out yet, and for which I would kindly ask your help:
1. `(60000, 28, 28)` means 60k arrays of 28 x 28 image values, right?
2. If 1. is right, how do I get my images (organized in the directory structure I described above) into this format? Is there a function that yields an array I can use like that?
3. I know I need some kind of generator function that collects all the images together with their labels, because in TensorFlow 1.9 `tf.keras.utils.image_dataset_from_directory()` does not seem to exist yet.
4. What do the labels actually look like? For example, with my directory structure, would I have something like this:
(A)

| File | Label |
|---|---|
| circle-0.jpg | circle |
| circle-233.jpg | circle |
| hexagon-1.jpg | hexagon |
| triangle-12.jpg | triangle |
or (B)

| File | Label |
|---|---|
| circle-0.jpg | circle-0 |
| circle-233.jpg | circle-233 |
| hexagon-1.jpg | hexagon-1 |
| triangle-12.jpg | triangle-12 |
where the respective images would already be converted into the `(60000, 28, 28)`-style format? It seems as if I have to write all of these functions myself, since there does not seem to be a ready-made function that turns a directory structure like mine into a dataset TensorFlow 1.9 can work with, or is there? I know of `tf.keras.preprocessing.image.ImageDataGenerator`, `image_dataset_from_directory`, and `flow_from_directory()`; however, none of them seem to give me the dataset tuple format I need.
I would really appreciate any help!
CodePudding user response:
You have to build a custom data generator for that. If you have two arrays, `train_paths` containing the paths to the images and `train_labels` containing the labels of those images, then the function below (`datagen`) will yield the images as arrays, together with their respective labels, as a tuple `(image_array, label)`.
I have also added a way to integer-encode your labels, using a dictionary `encode_label`. For example, `train_paths` and `train_labels` should look like this:
train_paths = np.array(['path/to/image1.jpg','path/to/image2.jpg','path/to/image3.jpg'])
train_labels = np.array(['circle','square','hexagon'])
where the image at path 'path/to/image1.jpg' has the label 'circle', the image at 'path/to/image2.jpg' has the label 'square', and so on.
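In case you do not have these two arrays yet, here is one possible way to build them from your directory structure (just a sketch, assuming the folder layout from your question, that every class folder only contains .jpg files, and that the keys of `encode_label` below then match your folder names; the helper name `paths_and_labels` is made up for this example):

```
import os
import numpy as np

def paths_and_labels(root='./'):
    # Walk the class folders (circles, hexagons, ...) and collect every
    # image path together with the name of its folder as the label
    paths, labels = [], []
    for class_dir in sorted(os.listdir(root)):
        full_dir = os.path.join(root, class_dir)
        if not os.path.isdir(full_dir):
            continue
        for fname in sorted(os.listdir(full_dir)):
            if fname.lower().endswith('.jpg'):
                paths.append(os.path.join(full_dir, fname))
                labels.append(class_dir)  # e.g. 'circles'
    return np.array(paths), np.array(labels)

train_paths, train_labels = paths_and_labels('./')
```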
The generator function below returns the data in batches, and you can add your own augmentation techniques as well (inside the `augment` function):
import numpy as np
import tensorflow as tf

# Hyperparameters
HEIGHT = 224    # Image height
WIDTH = 224     # Image width
CHANNELS = 3    # Image channels

# This dictionary maps your string labels to integers
encode_label = {'hexagon': 0, 'circle': 1, 'square': 2}

def augment(image):
    # All your augmentation techniques are done here
    return image

def encode_labels(labels):
    # Turn a list of string labels into their integer codes
    encoded = []
    for label in labels:
        encoded.append(encode_label[label])
    return encoded

def open_images(paths):
    '''
    Given a list of paths to images, this function loads
    the images from the paths, augments them, and returns them as a batch.
    '''
    images = []
    for path in paths:
        # target_size expects (height, width)
        image = tf.keras.preprocessing.image.load_img(path, target_size=(HEIGHT, WIDTH))
        image = np.array(image)
        image = augment(image)
        images.append(image)
    return np.array(images)

# This is the data generator
def datagen(paths, labels, batch_size=32):
    for x in range(0, len(paths), batch_size):
        # Load a batch of images
        batch_paths = paths[x:x + batch_size]
        batch_images = open_images(batch_paths)
        # Load the corresponding batch of labels
        batch_labels = labels[x:x + batch_size]
        batch_labels = encode_labels(batch_labels)
        batch_labels = np.array(batch_labels, dtype='float').reshape(-1)
        yield batch_images, batch_labels
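If you want the same in-memory `(x_train, y_train)` arrays as in your MNIST example (this only works if the whole dataset fits into memory), you could, for instance, simply collect everything the generator yields:

```
import numpy as np

# Collect all batches from the generator into two big arrays,
# mirroring the (x_train, y_train) layout of the MNIST example
all_images, all_labels = [], []
for batch_images, batch_labels in datagen(train_paths, train_labels, batch_size=32):
    all_images.append(batch_images)
    all_labels.append(batch_labels)

x_train = np.concatenate(all_images)  # shape: (num_images, HEIGHT, WIDTH, CHANNELS)
y_train = np.concatenate(all_labels)  # shape: (num_images,)
```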
If you cannot get `tf.keras.preprocessing.image.load_img` working in your TensorFlow version, try an alternative way to load and resize the images. One option is to load the image with matplotlib and resize it with skimage. The `open_images` function would then look like this:
import numpy as np
import matplotlib.image as mpimg
from skimage.transform import resize

def open_images(paths):
    '''
    Given a list of paths to images, this function loads
    the images from the paths, augments them, and returns them as a batch.
    '''
    images = []
    for path in paths:
        image = mpimg.imread(path)
        image = np.array(image)
        # Note: skimage's resize returns a float image scaled to [0, 1]
        image = resize(image, (HEIGHT, WIDTH, CHANNELS))
        image = augment(image)
        images.append(image)
    return np.array(images)
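And if you also want an actual `tf.data.Dataset` object (which you mentioned in your question), `tf.data.Dataset.from_generator` already exists in TensorFlow 1.9, so you could wrap the generator roughly like this (a sketch; `datagen_float32` is just a small helper added here so that the yielded dtypes match the declared `output_types`):

```
import tensorflow as tf

def datagen_float32(paths, labels, batch_size=32):
    # Cast the yielded arrays to float32 so they match output_types below
    for batch_images, batch_labels in datagen(paths, labels, batch_size):
        yield batch_images.astype('float32'), batch_labels.astype('float32')

dataset = tf.data.Dataset.from_generator(
    lambda: datagen_float32(train_paths, train_labels, batch_size=32),
    output_types=(tf.float32, tf.float32),
    output_shapes=(tf.TensorShape([None, HEIGHT, WIDTH, CHANNELS]),
                   tf.TensorShape([None])))

# Each element of this dataset is an (image_batch, label_batch) tuple
iterator = dataset.make_one_shot_iterator()
images, labels = iterator.get_next()
```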