So I have to use TensorFlow 1.9 for system-specific reasons. I want to train a CNN on a custom dataset consisting of images. The folder structure looks like this:
./
- circles/
  - circle-0.jpg
  - circle-1.jpg
  - ...
- hexagons/
  - hexagon-0.jpg
  - hexagon-1.jpg
  - ...
- ...
So the example I have to work with uses MNIST and has these two particular lines of code:
mnist_dataset = tf.keras.datasets.mnist.load_data('mnist_data')
(x_train, y_train), (x_test, y_test) = mnist_dataset
In my work, I also have to use this data format, `(x_train, y_train), (x_test, y_test)`, which seems to be quite common. As far as I have been able to find out so far, the format of those datasets is `(image_data, label)`, with shapes like `((60000, 28, 28), (60000,))`, at least for the MNIST dataset. The `image_data` here is supposedly of dtype `uint8` (according to this post). I was also able to find out that a `tf.data.Dataset()` object can look like the tuples I need here: `(image_data, label)`.
So far so good. But a few questions arise from this which I haven't been able to figure out yet, and for which I would kindly ask your help:
1. `(60000, 28, 28)` means 60k arrays of 28 x 28 image values, right?
2. If 1. is right, how do I get my images (organized in the directory structure I described above) into this format? Is there a function that yields an array I can use like that?
3. I know I need some kind of generator function that collects all the images together with their labels, because in TensorFlow 1.9 `tf.keras.utils.image_dataset_from_directory()` does not seem to exist yet.
4. What do the labels actually look like? For example, with my directory structure, would I have something like this:
(A)

| File | Label |
|---|---|
| circle-0.jpg | circle |
| circle-233.jpg | circle |
| hexagon-1.jpg | hexagon |
| triangle-12.jpg | triangle |
or (B)

| File | Label |
|---|---|
| circle-0.jpg | circle-0 |
| circle-233.jpg | circle-233 |
| hexagon-1.jpg | hexagon-1 |
| triangle-12.jpg | triangle-12 |
where the respective images would already be converted into the `(60000, 28, 28)`-style format? It seems as if I have to write all of these functions myself, since there does not seem to be a ready-made function that turns a directory structure like mine into a dataset TensorFlow 1.9 can work with, or is there? I know of `tf.keras.preprocessing.image.ImageDataGenerator`, `image_dataset_from_directory`, and `flow_from_directory()`; however, none of them seem to give me the dataset tuple format I need.
I would really appreciate any help!
CodePudding user response:
You have to build a custom data generator for that. If you have two arrays, `train_paths` containing the paths to the images and `train_labels` containing the labels of those images, then the function below (`datagen`) will yield the images as arrays, together with their respective labels, as a tuple `(image_array, label)`.
I have also added a way to integer-encode your labels, using a dictionary `encode_label`. For example, `train_paths` and `train_labels` should look like this:
train_paths = np.array(['path/to/image1.jpg','path/to/image2.jpg','path/to/image3.jpg'])
train_labels = np.array(['circle','square','hexagon'])
where the image at path 'path/to/image1.jpg' has the label 'circle', the image at 'path/to/image2.jpg' has the label 'square', and so on.
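In case you do not have these two arrays yet, here is one possible way to build them from your directory structure (just a sketch, assuming the folder layout from your question, that every class folder only contains .jpg files, and that the keys of `encode_label` below then match your folder names; the helper name `paths_and_labels` is made up for this example):

```
import os
import numpy as np

def paths_and_labels(root='./'):
    # Walk the class folders (circles, hexagons, ...) and collect every
    # image path together with the name of its folder as the label
    paths, labels = [], []
    for class_dir in sorted(os.listdir(root)):
        full_dir = os.path.join(root, class_dir)
        if not os.path.isdir(full_dir):
            continue
        for fname in sorted(os.listdir(full_dir)):
            if fname.lower().endswith('.jpg'):
                paths.append(os.path.join(full_dir, fname))
                labels.append(class_dir)  # e.g. 'circles'
    return np.array(paths), np.array(labels)

train_paths, train_labels = paths_and_labels('./')
```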
The generator function below returns the data in batches, and you can add your own augmentation techniques as well (inside the `augment` function):
import numpy as np
import tensorflow as tf

# Hyperparameters
HEIGHT = 224    # Image height
WIDTH = 224     # Image width
CHANNELS = 3    # Image channels

# This dictionary maps your string labels to integers
encode_label = {'hexagon': 0, 'circle': 1, 'square': 2}

def augment(image):
    # All your augmentation techniques are done here
    return image

def encode_labels(labels):
    # Turn a list of string labels into their integer codes
    encoded = []
    for label in labels:
        encoded.append(encode_label[label])
    return encoded

def open_images(paths):
    '''
    Given a list of paths to images, this function loads
    the images from the paths, augments them, and returns them as a batch.
    '''
    images = []
    for path in paths:
        # target_size expects (height, width)
        image = tf.keras.preprocessing.image.load_img(path, target_size=(HEIGHT, WIDTH))
        image = np.array(image)
        image = augment(image)
        images.append(image)
    return np.array(images)

# This is the data generator
def datagen(paths, labels, batch_size=32):
    for x in range(0, len(paths), batch_size):
        # Load a batch of images
        batch_paths = paths[x:x + batch_size]
        batch_images = open_images(batch_paths)
        # Load the corresponding batch of labels
        batch_labels = labels[x:x + batch_size]
        batch_labels = encode_labels(batch_labels)
        batch_labels = np.array(batch_labels, dtype='float').reshape(-1)
        yield batch_images, batch_labels
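If you want the same in-memory `(x_train, y_train)` arrays as in your MNIST example (this only works if the whole dataset fits into memory), you could, for instance, simply collect everything the generator yields:

```
import numpy as np

# Collect all batches from the generator into two big arrays,
# mirroring the (x_train, y_train) layout of the MNIST example
all_images, all_labels = [], []
for batch_images, batch_labels in datagen(train_paths, train_labels, batch_size=32):
    all_images.append(batch_images)
    all_labels.append(batch_labels)

x_train = np.concatenate(all_images)  # shape: (num_images, HEIGHT, WIDTH, CHANNELS)
y_train = np.concatenate(all_labels)  # shape: (num_images,)
```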
If you cannot get `tf.keras.preprocessing.image.load_img` working in your TensorFlow version, try an alternative way to load and resize the images. One option is to load the image with matplotlib and resize it with skimage. The `open_images` function would then look like this:
import numpy as np
import matplotlib.image as mpimg
from skimage.transform import resize

def open_images(paths):
    '''
    Given a list of paths to images, this function loads
    the images from the paths, augments them, and returns them as a batch.
    '''
    images = []
    for path in paths:
        image = mpimg.imread(path)
        image = np.array(image)
        # Note: skimage's resize returns a float image scaled to [0, 1]
        image = resize(image, (HEIGHT, WIDTH, CHANNELS))
        image = augment(image)
        images.append(image)
    return np.array(images)
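And if you also want an actual `tf.data.Dataset` object (which you mentioned in your question), `tf.data.Dataset.from_generator` already exists in TensorFlow 1.9, so you could wrap the generator roughly like this (a sketch; `datagen_float32` is just a small helper added here so that the yielded dtypes match the declared `output_types`):

```
import tensorflow as tf

def datagen_float32(paths, labels, batch_size=32):
    # Cast the yielded arrays to float32 so they match output_types below
    for batch_images, batch_labels in datagen(paths, labels, batch_size):
        yield batch_images.astype('float32'), batch_labels.astype('float32')

dataset = tf.data.Dataset.from_generator(
    lambda: datagen_float32(train_paths, train_labels, batch_size=32),
    output_types=(tf.float32, tf.float32),
    output_shapes=(tf.TensorShape([None, HEIGHT, WIDTH, CHANNELS]),
                   tf.TensorShape([None])))

# Each element of this dataset is an (image_batch, label_batch) tuple
iterator = dataset.make_one_shot_iterator()
images, labels = iterator.get_next()
```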