Say I'm doing a standard DNN classification task using the cross-entropy loss. After computing the loss, I apply a mask vector ([0, 0, 0, 1, 1, ...]) to set some of the per-example losses to zero.
The question is: how will TensorFlow handle this zeroed-out loss? Will it be involved in backpropagation or not?
CodePudding user response:
Yes, TensorFlow handles this correctly. The gradients flowing back through the masked-out loss values will simply be zero, because those values no longer influence the total loss.
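You can verify this yourself with a `tf.GradientTape`. Here is a minimal sketch (the shapes, labels, and mask values are made up for illustration) showing that the masked-out rows end up with zero gradient:

```python
import tensorflow as tf

logits_layer = tf.keras.layers.Dense(3)        # toy classifier head
x = tf.random.normal((5, 4))                   # 5 examples, 4 features
y = tf.constant([0, 1, 2, 1, 0])               # integer class labels
mask = tf.constant([0., 0., 0., 1., 1.])       # zero out the first three losses

with tf.GradientTape() as tape:
    logits = logits_layer(x)
    per_example_loss = tf.keras.losses.sparse_categorical_crossentropy(
        y, logits, from_logits=True)           # shape (5,)
    masked_loss = tf.reduce_sum(per_example_loss * mask)

grads = tape.gradient(masked_loss, logits)     # gradient w.r.t. the logits
print(grads)  # rows 0-2 are all zeros; only rows 3-4 carry gradient
```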
CodePudding user response:
Applying a mask to the loss after you have calculated the actual per-element loss means that the masked-out elements contribute zero gradient during backpropagation. For example, it is very common to apply a mask to the loss when dealing with time series data, which is usually padded so that every sequence has the same length. The losses at these padded positions tell your model nothing useful, so masking ensures they are ignored when the gradients are computed.
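As an illustration of that padding case, here is a hedged sketch (batch size, sequence lengths, and labels are all invented) that uses `tf.sequence_mask` to build the mask from the true sequence lengths and averages the loss only over real timesteps:

```python
import tensorflow as tf

batch, max_len, n_classes = 2, 4, 3
logits = tf.random.normal((batch, max_len, n_classes))
labels = tf.constant([[1, 2, 0, 0],            # second sequence is shorter,
                      [0, 1, 0, 0]])           # its trailing positions are padding
seq_lens = tf.constant([4, 2])                 # true (unpadded) lengths

# 1.0 for real timesteps, 0.0 for padding
mask = tf.sequence_mask(seq_lens, maxlen=max_len, dtype=tf.float32)

per_step_loss = tf.keras.losses.sparse_categorical_crossentropy(
    labels, logits, from_logits=True)          # shape (batch, max_len)

# Average only over the real timesteps; padded steps contribute nothing
# to the loss and therefore nothing to the gradients.
masked_loss = tf.reduce_sum(per_step_loss * mask) / tf.reduce_sum(mask)
```

Note the division by `tf.reduce_sum(mask)` rather than by the total number of timesteps: this keeps the loss scale independent of how much padding each batch happens to contain.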