Can I use the output of tf.keras.utils.image_dataset_from_directory to train an autoencoder?


To put it simply, I'd like to be able to use a Keras dataset created from a local image directory to train an autoencoder. To clarify, this is a model that approximates the identity function for images: ideally, the output is exactly equal to the input.

The dataset is too large to fit in memory, so converting the dataset to a numpy array with np.concatenate will not help me here.

In other words, I'd like an identity image dataset, where the label for each image is the image itself.

Here's my (non-working) sample code:

train_ds, validate_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  labels=None,
  validation_split=0.1,
  subset="both",
  shuffle=True,
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size,
  crop_to_aspect_ratio=True)

history = autoencoder.fit(
  x=train_ds,
  y=train_ds,
  validation_data=(validate_ds, validate_ds),
  epochs=epochs,
  batch_size=16
)

The image_dataset_from_directory function gives me a dataset of images with no labels. So far so good.

The second command fails with the error message:

ValueError: `y` argument is not supported when using dataset as input.

On the other hand, if I exclude the y variable I get this error:

ValueError: Target data is missing. Your model was compiled with loss=binary_crossentropy, and therefore expects target data to be provided in `fit()`.

Which is not at all surprising, because there are NO labels, as I requested none. And yet it won't let me use the dataset as its own labels, which is what I need to do.

Any help would be appreciated.

CodePudding user response:

While there are ways to modify the dataset, I think the best option is to write a custom model class. This is modified from the official tutorial:

import tensorflow as tf

class Autoencoder(tf.keras.Model):
    def train_step(self, data):
        # Unpack the data. Its structure depends on your model and
        # on what you pass to `fit()`.
        x = data  # CHANGE 1: changed from x, y = data

        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)  # Forward pass
            # Compute the loss value
            # (the loss function is configured in `compile()`)
            loss = self.compiled_loss(x, y_pred, regularization_losses=self.losses)  # CHANGE 2: replaced y by x as label

        # Compute gradients
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        # Update weights
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        # Update metrics (includes the metric that tracks the loss)
        self.compiled_metrics.update_state(x, y_pred)  # CHANGE 3: like change 2
        # Return a dict mapping metric names to current value
        return {m.name: m.result() for m in self.metrics}

    def test_step(self, data):
        # CHANGED in the same way
        x = data
        # Compute predictions
        y_pred = self(x, training=False)
        # Updates the metrics tracking the loss
        self.compiled_loss(x, y_pred, regularization_losses=self.losses)
        # Update the metrics.
        self.compiled_metrics.update_state(x, y_pred)
        # Return a dict mapping metric names to current value.
        # Note that it will include the loss (tracked in self.metrics).
        return {m.name: m.result() for m in self.metrics}

This subclasses tf.keras.Model, so it works with models built via the functional API. If you are using a Sequential model, inherit from tf.keras.Sequential instead. You can use the class as a drop-in replacement for the normal model constructor.
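For illustration, here is a minimal sketch of how the subclass might be wired up. The architecture and the rescaling step are placeholder assumptions, not taken from the question; only img_height, img_width, epochs, and the datasets come from the original code.

# Scale pixels to [0, 1] so they match the sigmoid output range
# (binary_crossentropy expects targets in [0, 1]).
train_ds = train_ds.map(lambda im: im / 255.0)
validate_ds = validate_ds.map(lambda im: im / 255.0)

# Hypothetical architecture, for illustration only.
inputs = tf.keras.Input(shape=(img_height, img_width, 3))
x = tf.keras.layers.Flatten()(inputs)
encoded = tf.keras.layers.Dense(64, activation="relu")(x)
x = tf.keras.layers.Dense(img_height * img_width * 3, activation="sigmoid")(encoded)
outputs = tf.keras.layers.Reshape((img_height, img_width, 3))(x)

autoencoder = Autoencoder(inputs, outputs)  # the subclass above, not plain tf.keras.Model
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

# No `y` argument: train_step/test_step take the target from the input batch.
history = autoencoder.fit(train_ds, validation_data=validate_ds, epochs=epochs)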

Another option could be to use train_zipped = tf.data.Dataset.zip((train_ds, train_ds)) to create an (input, target) dataset that you can feed directly into the usual model and loss function. Personally, I don't like the duplication. Also, I'm not sure whether this behaves correctly with shuffled data (will both copies of train_ds be shuffled in the same way?).
You could sidestep this by setting shuffle=False in image_dataset_from_directory and then using train_zipped = train_zipped.shuffle(buffer_size) instead. However, in my experience this is very slow.
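A minimal sketch of that zipped-dataset route, assuming the same variable names (data_dir, img_height, img_width, batch_size, epochs) as the question and a buffer_size of your choosing:

# Load without shuffling so the two zipped copies stay aligned.
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    labels=None,
    shuffle=False,
    image_size=(img_height, img_width),
    batch_size=batch_size)

# Each element is now an (input, target) pair with target == input,
# so a plain tf.keras.Model can be trained on it directly.
train_zipped = tf.data.Dataset.zip((train_ds, train_ds))

# Note: the loader already batched the data, so this shuffles
# whole batches rather than individual images.
train_zipped = train_zipped.shuffle(buffer_size)

history = autoencoder.fit(train_zipped, epochs=epochs)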

CodePudding user response:

This error message indicates that the target data is missing when model.fit() is invoked. The model was compiled with the binary_crossentropy loss function, so during training it expects target data in order to compute the loss and update its weights.

To resolve this, you must supply the target data when calling model.fit(). This can be done by passing a second argument, as in model.fit(X, y), where X is the input data and y is the target data. The shape of y should match the shape of your model's output.

If you are using a Sequential model, make sure that you pass the target variable during training and that the last layer of your model has the appropriate number of neurons, corresponding to the number of outputs you want to predict.

Make sure the target data is available and in the correct format before calling model.fit().
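For an autoencoder the target is simply the input, so when the data fits in memory this amounts to passing the same array twice. A tiny sketch with hypothetical in-memory data (x_train here is a made-up placeholder, not from the question):

import numpy as np

# Hypothetical stand-in: 1000 images scaled to [0, 1].
x_train = np.random.rand(1000, img_height, img_width, 3).astype("float32")

# The input doubles as the target, which satisfies the
# "Target data is missing" check in fit().
history = autoencoder.fit(x_train, x_train, epochs=epochs, batch_size=16)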
