I'm trying to predict a direction given by two angles (theta and phi). I defined a loss function based on the angular distance between the predicted and the true direction, but I keep getting NaN values after a few epochs.
Given that the output activation is linear, my custom loss is:
import tensorflow as tf
import numpy as np
import tensorflow.keras.backend as K

def square_angular_distance(y_true, y_pred):
    the_pred = K.abs(y_pred[:, 0])
    phi_pred = (y_pred[:, 1]) % (2 * np.pi)
    the_true = y_true[:, 0]
    phi_true = y_true[:, 1]
    cos_phi_pred = K.cos(phi_pred)
    cos_phi_true = K.cos(phi_true)
    sin_phi_pred = K.sin(phi_pred)
    sin_phi_true = K.sin(phi_true)
    cos_the_pred = K.cos(the_pred)
    cos_the_true = K.cos(the_true)
    sin_the_pred = K.sin(the_pred)
    sin_the_true = K.sin(the_true)
    v_true = K.stack((sin_the_true * cos_phi_true, sin_the_true * sin_phi_true, cos_the_true), axis=1)
    v_pred = K.stack((sin_the_pred * cos_phi_pred, sin_the_pred * sin_phi_pred, cos_the_pred), axis=1)
    v_dot = K.batch_dot(v_true, v_pred)
    angle_dist = tf.math.acos(K.clip(v_dot, -1., 1.)) * 180. / np.pi
    return K.mean(K.square(angle_dist), axis=-1)
Here y_pred[:, 0] and y_pred[:, 1] are respectively the theta and phi angles of a unit vector (same for y_true).
I tried adding regularizers, clipping the gradient, and changing the learning rate, and I also checked that the data contains no NaN/Inf values.
I also tried to clip the output values using a custom activation function for the output layer but it didn't resolve the problem.
Any suggestions on what I am doing wrong?
CodePudding user response:
The comment from @ATony resolved the problem.
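Narrowing the range that v_dot is clipped to before it is passed to tf.math.acos prevented the loss from becoming NaN. The derivative of acos(x) is -1/sqrt(1 - x^2), which is unbounded at x = ±1, so keeping the input strictly inside (-1, 1) keeps the gradient finite: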
K.clip(v_dot, -.999, .999)
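To see why the tighter clip helps, here is a minimal sketch in plain Python (standard library only, not TensorFlow's autodiff) that applies the chain rule to the squared angle by hand. The helper names `acos_grad`, `clip` and `squared_angle_grad` are made up for this illustration, and the degree conversion from the original loss is left out because it only rescales the result.

```python
import math

def acos_grad(x):
    """Analytic derivative of acos(x): -1 / sqrt(1 - x^2)."""
    denom = math.sqrt(1.0 - x * x)
    # The derivative diverges at x = +/-1; return -inf there instead of raising.
    return -math.inf if denom == 0.0 else -1.0 / denom

def clip(x, lo, hi):
    """Scalar stand-in for K.clip."""
    return max(lo, min(hi, x))

def squared_angle_grad(v_dot, bound):
    """Hand-written chain rule for d(acos(x)^2)/dx at the clipped x."""
    x = clip(v_dot, -bound, bound)
    angle = math.acos(x)
    return 2.0 * angle * acos_grad(x)

# v_dot = 1.0 corresponds to a perfect prediction (identical unit vectors).
for bound in (1.0, 0.999):
    print(bound, squared_angle_grad(1.0, bound))
```

With the bound at 1.0 the angle is exactly 0 and the acos derivative is infinite, so the product 0 * inf evaluates to NaN. With the bound at 0.999 both factors are finite. The price is that the loss can no longer reach an angle of exactly zero (acos(0.999) is about 2.6 degrees), which may or may not matter for the precision you need.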