How does the Keras method image_dataset_from_directory() distinguish X and Y data?


I'm using the ADE20K dataset to train a U-Net-like model for segmentation in Keras.

The dataset has over 1000 classes. I'm trying to use the Keras method image_dataset_from_directory() to load the dataset into a tf.data.Dataset object.

The following documentation shows how to load such a dataset and pass it into your model: https://keras.io/api/preprocessing/

# directory for training data
training_data/
...class_a/
......a_image_1.jpg
......a_image_2.jpg
...class_b/
......b_image_1.jpg
......b_image_2.jpg
etc.


from tensorflow import keras
from tensorflow.keras.preprocessing import image_dataset_from_directory

train_ds = image_dataset_from_directory(
    directory='training_data/',
    labels='inferred',
    label_mode='categorical',
    batch_size=32,
    image_size=(256, 256))
validation_ds = image_dataset_from_directory(
    directory='validation_data/',
    labels='inferred',
    label_mode='categorical',
    batch_size=32,
    image_size=(256, 256))

model = keras.applications.Xception(weights=None, input_shape=(256, 256, 3), classes=10)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model.fit(train_ds, epochs=10, validation_data=validation_ds)

In the above example, a dataset object is built from the provided folder structure, where each class is a folder in the directory. In my case, I have a directory like this:

ADE20k_Data/
...cars/
......image_1.jpg
......image_1_segmentation.png
......image_2.jpg
......image_2_segmentation.png
...restaurant/
......image_1.jpg
......image_1_segmentation.png
......image_2.jpg
......image_2_segmentation.png
etc.

In each class folder I have both X and Y (the raw image and its segmentation mask).

If I load my dataset according to the above example and pass it into the .fit() method, how are X and Y distinguished?

I guess that is where my confusion lies: how to properly arrange a directory structure for image segmentation.

CodePudding user response:

The way you are using it will prepare the data for classification, not segmentation: it will use the images as X and the folder names ("cars", "restaurant") as the classification labels Y.
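For illustration, here is a quick way to see that, using the train_ds from your snippet (the shapes assume batch_size=32 and image_size=(256, 256)):

# Peek at one batch: X is a batch of raw images, Y is one-hot folder labels
for images, labels in train_ds.take(1):
    print(images.shape)  # (32, 256, 256, 3) -> X: the images
    print(labels.shape)  # (32, num_classes) -> Y: one-hot class labels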

I suggest you create your own tf.data.Dataset.

Considering your folder structure, and assuming every image is a "*.jpg" with a matching "*_segmentation.png" pair, you can use the following code to find all the images and their corresponding segmentation masks.

import glob
# collect all raw images, then derive each mask path from the image path
jpgs = glob.glob('ADE20k_Data/*/*.jpg')
pngs = [f.split('.jpg')[0] + "_segmentation" + ".png" for f in jpgs]
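If you want to be sure every image really has a mask before going further, a quick check like this (just a minimal sketch) will flag any missing files:

import os

# every derived mask path should exist on disk; list any that do not
missing = [p for p in pngs if not os.path.exists(p)]
print(len(missing), "masks missing out of", len(jpgs), "images")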

Then you can create your Dataset from this data

import tensorflow as tf
dataset = tf.data.Dataset.from_tensor_slices((jpgs, pngs))

At this point, if you do something like

for pair in dataset.take(1):
    print(pair)

It will give you one pair of tensors: the first contains the path to the image, the second the path to the corresponding segmentation mask.

Then you can read the actual images from those paths, for example like this:

def read_images(img_path, segmentation_mask_path):
    # read and decode the raw image (a uint8 tensor of shape [H, W, C])
    img_data = tf.io.read_file(img_path)
    img = tf.io.decode_jpeg(img_data)

    # read and decode the corresponding segmentation mask
    segm_data = tf.io.read_file(segmentation_mask_path)
    segm_mask = tf.io.decode_png(segm_data)

    return img, segm_mask

dataset = dataset.map(read_images)
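If you just want to check the resulting structure without decoding any files, element_spec shows it; the spatial dimensions are still unknown here because the images have not been resized yet:

# shapes are (None, None, None) at this stage: height, width and channel
# counts vary from file to file until we resize below
print(dataset.element_spec)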

Next you can do some preprocessing for your model

HEIGHT = 256
WIDTH = 256

def prepare_images(img, semg_mask):
    img = tf.image.resize(img, [HEIGHT, WIDTH])
    # resize the mask with nearest-neighbour so the class labels stay
    # intact; bilinear interpolation would blend neighbouring labels
    # into values that are not valid classes
    semg_mask = tf.image.resize(semg_mask, [HEIGHT, WIDTH], method='nearest')
    return img, semg_mask


dataset = dataset.map(prepare_images)

Now, if you take one instance from your dataset

for pair in dataset.take(1):
    print(pair)

It will give you a pair of tensors: the first contains the input image, the second the segmentation mask as your output.

Obviously you will still need a number of other things: selecting the right network architecture, normalising the input images (just divide img by 255), splitting your dataset into train/val/test, shuffling the training data, and batching. You can achieve all of this with the tf.data API; for example, dataset = dataset.batch(batch_size) will generate X and Y in batches, as your model requires. https://www.tensorflow.org/api_docs/python/tf/data/Dataset
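As a rough sketch of those last steps (the buffer and batch sizes here are arbitrary placeholders, not tuned values):

def normalise(img, segm_mask):
    # scale images to [0, 1]; leave the integer mask labels untouched
    return tf.cast(img, tf.float32) / 255.0, segm_mask

dataset = dataset.map(normalise)
dataset = dataset.shuffle(buffer_size=1000)   # shuffle the training data
dataset = dataset.batch(32)                   # yield (X, Y) in batches
dataset = dataset.prefetch(tf.data.AUTOTUNE)  # overlap I/O with training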

And then just pass your dataset into the fit method, as you already do: model.fit(dataset, epochs=10)
