I'm trying to train a U-Net like model to segment the German Asphalt Pavement Distress Dataset.
Mask images are stored as grey value images. Coding of the grey values:
0 = VOID, 1 = intact road, 2 = applied patch, 3 = pothole, 4 = inlaid patch, 5 = open joint, 6 = crack, 7 = street inventory
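Before training, it can help to confirm that the masks really contain these grey values; a minimal sketch, where the mask path is a placeholder for any mask file from the dataset:

import numpy as np
from PIL import Image

# Placeholder path; point it at any mask image from the dataset.
mask = np.array(Image.open("masks/example_mask.png"))
print(np.unique(mask))  # should be a subset of [0, 1, 2, 3, 4, 5, 6, 7]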
I found the following colab notebook which was implementing U-Net segmentation on Oxford pets dataset: https://colab.research.google.com/github/keras-team/keras-io/blob/master/examples/vision/ipynb/oxford_pets_image_segmentation.ipynb
I modified the notebook to fit my problem of GAPs segmentation, and this is a link to my modified notebook: https://colab.research.google.com/drive/1YfM4lC78QNdfbkgz-1LGSKaBG4-65dkC?usp=sharing
The training runs, but while the loss decreases, the accuracy never rises above 0.05. I have been stuck on this issue for days now, and I need help getting the model to train properly.
The following is a link to the dataset images and masks: https://drive.google.com/drive/folders/1-JvLSa9b1falqEake2KVaYYtyVh-dgKY?usp=sharing
CodePudding user response:
In the Sequence class, you do not shuffle the content of the batches; only the batch order is shuffled by the fit method. You have to shuffle the order of all the data at each epoch. Here is a way to do it in a Sequence subclass:
import random
import numpy as np
from tensorflow import keras
from tensorflow.keras.preprocessing.image import load_img

class OxfordPets(keras.utils.Sequence):
    """Helper to iterate over the data (as Numpy arrays)."""

    def __init__(self, batch_size, img_size, input_img_paths, target_img_paths):
        self.batch_size = batch_size
        self.img_size = img_size
        self.input_img_paths = input_img_paths
        self.target_img_paths = target_img_paths
        self.set_len = len(self.target_img_paths) // self.batch_size
        # Random permutation of every sample index, rebuilt after each epoch.
        self.indices = random.sample(range(len(self.target_img_paths)),
                                     k=len(self.target_img_paths))

    def __len__(self):
        return self.set_len

    def __getitem__(self, idx):
        """Returns tuple (input, target) corresponding to batch #idx."""
        i = idx * self.batch_size
        indices = self.indices[i : i + self.batch_size]
        batch_input_img_paths = [self.input_img_paths[k] for k in indices]
        batch_target_img_paths = [self.target_img_paths[k] for k in indices]
        x = np.zeros((self.batch_size,) + self.img_size + (3,), dtype="float32")
        for j, path in enumerate(batch_input_img_paths):
            img = load_img(path, target_size=self.img_size)
            x[j] = img
        y = np.zeros((self.batch_size,) + self.img_size + (1,), dtype="uint8")
        for j, path in enumerate(batch_target_img_paths):
            img = load_img(path, target_size=self.img_size, color_mode="grayscale")
            y[j] = np.expand_dims(img, 2)
            # The Oxford Pets notebook subtracts 1 here because its labels are
            # 1, 2, 3; the GAPs labels are already 0..7, so no shift is needed:
            # y[j] -= 1
        return x, y

    def on_epoch_end(self):
        # Reshuffle all sample indices so batch contents change every epoch.
        self.indices = random.sample(range(len(self.target_img_paths)),
                                     k=len(self.target_img_paths))
self.indices is a random permutation of all the sample indices. It is built in the constructor and rebuilt at the end of each epoch, which shuffles the order of all the data across batches, not just the batch order.
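For reference, here is a minimal sketch of how the generator might be wired into training. The path lists, image size, and model are placeholders rather than values from the original notebook, and sparse_categorical_crossentropy is assumed because the masks store integer class labels 0..7:

# Placeholder names: train/val path lists and a compiled segmentation `model`
# are assumed to exist; img_size and batch_size are example values.
batch_size = 32
img_size = (160, 160)
train_gen = OxfordPets(batch_size, img_size, train_input_img_paths, train_target_img_paths)
val_gen = OxfordPets(batch_size, img_size, val_input_img_paths, val_target_img_paths)

model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_gen, epochs=15, validation_data=val_gen)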
Using the rmsprop optimizer, it then works:
Epoch 1/15
88/88 [==============================] - 96s 1s/step - loss: 1.9617 - categorical_accuracy: 0.9156 - val_loss: 5.8705 - val_categorical_accuracy: 0.9375
Epoch 2/15
88/88 [==============================] - 93s 1s/step - loss: 0.4754 - categorical_accuracy: 0.9369 - val_loss: 1.9207 - val_categorical_accuracy: 0.9375
Epoch 3/15
88/88 [==============================] - 94s 1s/step - loss: 0.4497 - categorical_accuracy: 0.9447 - val_loss: 9.3833 - val_categorical_accuracy: 0.9375
Epoch 4/15
88/88 [==============================] - 94s 1s/step - loss: 0.3173 - categorical_accuracy: 0.9423 - val_loss: 14.2518 - val_categorical_accuracy: 0.9369
Epoch 5/15
88/88 [==============================] - 94s 1s/step - loss: 0.0645 - categorical_accuracy: 0.9400 - val_loss: 110.9821 - val_categorical_accuracy: 0.8963
Note that overfitting sets in very quickly: the validation loss climbs sharply after the first epochs while the training loss keeps falling.
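A common mitigation is to monitor val_loss with standard Keras callbacks and keep the best weights; a minimal sketch, where the checkpoint filename is a placeholder:

# Stop when val_loss stops improving and restore the best weights seen so far;
# also save the best model to disk (filename is a placeholder).
callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                  restore_best_weights=True),
    keras.callbacks.ModelCheckpoint("gaps_unet_best.h5", monitor="val_loss",
                                    save_best_only=True),
]
model.fit(train_gen, epochs=15, validation_data=val_gen, callbacks=callbacks)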