Load TIFF images with Keras ImageDataGenerator

Time:05-24

I'm trying to run multi-class training with U-Net in Keras/TensorFlow on Python 3.7. I only have experience with binary training on grayscale (1-channel) .jpg images with values in the range [0, 255], which I loaded with the ImageDataGenerator class.

In this case, I need to load 1-channel .tif images with values ranging from -1000 to 7000. As far as I can tell, ImageDataGenerator loads the images in [0, 255], which causes me to lose a lot of information. Is there any way to load those images with their original values using ImageDataGenerator? I know the Pillow library loads them properly, but I have a lot of data and need to load it efficiently.

Assuming I have the images in /full/path/to/my/dir/images and the masks in /full/path/to/my/dir/masks, the code I'm using to load the images and the masks is as follows:

from keras_preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator()
train_image_generator = train_datagen.flow_from_directory(
    "/full/path/to/my/dir",
    classes=["images"],  # must be a list of subdirectory names, not a bare string
    batch_size=16,
    color_mode="grayscale",
    target_size=(400, 400),
    class_mode=None,
    seed=100,
    shuffle=True,
)
train_mask_generator = train_datagen.flow_from_directory(
    "/full/path/to/my/dir",
    classes=["masks"],  # must be a list of subdirectory names, not a bare string
    batch_size=16,
    color_mode="grayscale",
    target_size=(400, 400),
    class_mode=None,
    seed=100,
    shuffle=True,
)

Thanks in advance.

CodePudding user response:

ImageDataGenerator accepts a preprocessing_function argument that you can define yourself. It is applied to each image and returns the result. Note that the processed image must have the same shape as the input image, but you can change the pixel values.
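
A minimal sketch of such a function, assuming per-image min-max scaling is what you want. The name scale_to_unit_range and the choice of scaling are mine, not something from the question. Keep in mind that the function only receives the array the generator has already loaded, so it cannot bring back values that were lost at load time.

```python
import numpy as np


def scale_to_unit_range(image):
    """Hypothetical preprocessing function: min-max scale one image to [0, 1].

    Takes one rank-3 array (height, width, channels) and returns an array
    of the same shape, as preprocessing_function requires.
    """
    image = image.astype("float32")
    low, high = image.min(), image.max()
    if high == low:
        # Constant image: avoid dividing by zero
        return np.zeros_like(image)
    return (image - low) / (high - low)


# Tiny stand-in for a loaded 2x2 grayscale image with one channel
sample = np.array([[[0.0], [51.0]], [[102.0], [255.0]]], dtype="float32")
scaled = scale_to_unit_range(sample)
```

You would then pass it as ImageDataGenerator(preprocessing_function=scale_to_unit_range).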

CodePudding user response:

I tried using the preprocessing function and inspected the input image inside it in the debugger. The input image already has values in [0, 255]. In fact, it looks like the image is loaded as uint8 and then converted to float32, because it holds float32 values in [0, 255] with no fractional part, that is, always 0., 1., 2., and so on.

Alternatively, I tried to implement the following custom ImageDataGenerator, using this source as a guide:


import random

import cv2
import numpy as np
import tensorflow as tf
from PIL import Image


class CustomImageDataGenerator(tf.keras.utils.Sequence):
    def __init__(self, input_dir,
                 batch_size=None,
                 target_size=(400, 400), shuffle=True, recursive=False,
                 interpolation="nearest",
                 sort_method="numerically", sort_order="ascending",
                 sort_prefix=None, sort_suffix=None,
                 ):
        # List files into the given directory
        self.files = self.__list_files(input_dir, full_path=True,
                                       recursive=recursive,
                                       sort_method=sort_method, sort_order=sort_order,
                                       sort_prefix=sort_prefix, sort_suffix=sort_suffix)
        # Check images and masks sizes
        if len(self.files) == 0:
            raise Exception("No files found at '{}'".format(input_dir))

        self.target_size = target_size
        self.interpolation = self.__get_interpolation(interpolation)
        # Get the size
        self.n = len(self.files)

        # Other attributes
        if batch_size is None:  # Set batch size equal to the length of the data
            batch_size = self.n
        self.batch_size = batch_size

        self.shuffle = shuffle

    def __getitem__(self, index):
        # This method should return one batch of data
        batches = self.files[index * self.batch_size:(index + 1) * self.batch_size]
        return self.__get_data(batches)

    def __len__(self):
        # Should return an int value
        return self.n // self.batch_size

    def on_epoch_end(self):
        if self.shuffle:
            random.shuffle(self.files)

    def __get_data(self, batches):
        # Generates data containing batch_size samples
        X_batch = np.asarray([self.__get_input(x) for x in batches])
        return X_batch

    def __get_input(self, path):
        # Load image
        image = np.asarray(Image.open(path))
        # Resize to target size
        if image.shape != self.target_size:
            image = cv2.resize(image, self.target_size, interpolation=self.interpolation)
        # Expand dimensions to get the right shape as the input of the neural network
        if len(image.shape) == 2:  # Append the channel
            image = np.expand_dims(image, axis=2)
        return image

    @staticmethod
    def __get_interpolation(inter_str):
        if inter_str == "nearest":
            inter = cv2.INTER_NEAREST
        elif inter_str == "cubic":
            inter = cv2.INTER_CUBIC
        elif inter_str == "linear":
            inter = cv2.INTER_LINEAR
        else:
            raise Exception(
                "Not supported option. Current: '{}'. Expected: '{}'".format(inter_str,
                                                                             "'nearest', 'cubic', 'linear'"))
        return inter

However, it still doesn't work the way I'd like. The idea is that when I use the Keras ImageDataGenerator as follows:

datagen = ImageDataGenerator()
train_image_generator = datagen.flow_from_directory(
    "/full/path/to/my/dir",
    classes=["images"],  # must be a list of subdirectory names, not a bare string
    batch_size=16,
    color_mode="grayscale",
    target_size=(400, 400),
    class_mode=None,
    seed=100,
    shuffle=True,
)

I can call next(train_image_generator) and get a batch with shape (16, 400, 400, 1), which is what I expect. However, when I use my custom generator:

train_image_generator_2 = CustomImageDataGenerator(input_dir="/full/path/to/my/dir",
                                                   batch_size=16,
                                                   target_size=(400, 400),
                                                   shuffle=False,
                                                   )

and I run next(train_image_generator_2), the code returns the following error:

{TypeError}'CustomImageDataGenerator' object is not an iterator

The thing is, I need it to work for me as an iterator so I can review its contents in a loop. I couldn't figure out what it would take for it to work for me as an iterator. Can anyone think of anything?
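
For what it's worth, that error usually means the object has no __next__ method. next() only accepts an iterator, while an object with __getitem__ and __len__ is merely iterable. Below is a minimal sketch of the difference using a plain-Python stand-in. BatchSequence is a hypothetical class, not the Keras one, and it uses math.ceil in __len__ so the last partial batch is kept, which differs from the floor division in the class above.

```python
import math


class BatchSequence:
    """Hypothetical stand-in for a Keras Sequence: indexable, but not an iterator."""

    def __init__(self, items, batch_size):
        self.items = list(items)
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches, counting a final partial batch
        return math.ceil(len(self.items) / self.batch_size)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.batch_size
        return self.items[start:start + self.batch_size]

    def __iter__(self):
        # A generator method: each call returns a fresh iterator over the batches
        for index in range(len(self)):
            yield self[index]


seq = BatchSequence(range(10), batch_size=4)

try:
    next(seq)  # raises TypeError: seq defines no __next__
    next_on_sequence_failed = False
except TypeError:
    next_on_sequence_failed = True

batch_iterator = iter(seq)          # wrap the sequence in an iterator first
first_batch = next(batch_iterator)  # now next() works
all_batches = list(seq)             # a for loop works the same way
```

So for the custom generator, something like next(iter(train_image_generator_2)) or a plain for loop over it is probably what you want. As far as I know, tf.keras.utils.Sequence in TF 2.x already defines __iter__, but treat that as an assumption and check it against your installed version.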
