I am working on a facial landmark detection CNN. The dataset consists of images with 15 landmarks (each with x,y coordinates), and some images only have labels for 4 of those 15 landmarks.
Instead of filling in the missing values or training only on the fully labelled data, I want to use padding and masking on the output layer so that only the labels that are actually present for a given image contribute to training. For example, for images with only 4 labels the model should simply ignore the other outputs when calculating the loss and backpropagating.
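For illustration (the NaN encoding and which landmarks are missing are just an example, not my actual data layout), the target vector for an image could look like this:
import numpy as np
# 15 landmarks -> 30 target values, ordered (x1, y1, ..., x15, y15)
y_full = np.random.rand(30).astype('float32')      # fully labelled image: all 30 values present
y_partial = np.full(30, np.nan, dtype='float32')    # partially labelled image: start with everything missing
y_partial[:8] = np.random.rand(8)                   # only 4 landmarks (8 coordinates) are labelled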
Currently the model architecture looks something like this:
from keras.models import Sequential
from keras.layers import Convolution2D, BatchNormalization, Flatten, Dense, Dropout, MaxPool2D, LeakyReLU
model = Sequential()
model.add(Convolution2D(32, (3, 3), padding='same', use_bias=False, input_shape=(96, 96, 1)))
model.add(LeakyReLU(alpha=0.1))
model.add(BatchNormalization())
model.add(Convolution2D(32, (3, 3), padding='same', use_bias=False))
model.add(LeakyReLU(alpha=0.1))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(2, 2)))
# more Conv2d layers and so on....
model.add(Convolution2D(512, (3, 3), padding='same', use_bias=False))
model.add(LeakyReLU(alpha=0.1))
model.add(BatchNormalization())
model.add(Convolution2D(512, (3, 3), padding='same', use_bias=False))
model.add(LeakyReLU(alpha=0.1))
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(30))
How do I do this with Keras? Is there a "proper" way to do it without writing my own loss function?
CodePudding user response:
You can define a custom loss function. A simple trick is to use -1. as the masking value in y_true (the labels): fill the missing coordinates with -1 and ignore those positions when computing the loss.
import tensorflow as tf

def mycustomloss(y_true, y_pred):
    # Element-wise Mean Absolute Error as an example; plug in your own loss here, without reduction.
    loss = tf.abs(y_true - y_pred)
    # Zero out the loss wherever the label equals the masking value -1.
    loss = tf.where(y_true != -1., loss, 0.)
    return tf.reduce_mean(loss)
model.compile(optimizer='adam', loss=mycustomloss)
Pay attention to the dtype of the labels (they should be floats so the comparison with -1. behaves as expected), and compute the loss element-wise without reduction first (same shape as y_true), as shown, before reducing at the end.
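For completeness, a minimal usage sketch, assuming targets y of shape (num_samples, 30) with np.nan marking missing coordinates and images X of shape (num_samples, 96, 96, 1); the array names and training parameters are assumptions, not from the question:
import numpy as np
# Replace missing coordinates with the -1. sentinel expected by mycustomloss.
y_masked = np.where(np.isnan(y), -1.0, y).astype('float32')

model.compile(optimizer='adam', loss=mycustomloss)
model.fit(X, y_masked, batch_size=32, epochs=10, validation_split=0.1)
Note that tf.reduce_mean averages over all 30 outputs, including the masked ones; if you would rather average over only the labelled coordinates, you could instead divide the summed loss by the number of unmasked entries.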