Loss is NaN after a few epochs

I'm trying to predict a direction given by two angles (theta and phi). I defined a loss function that is the angular distance between the predicted and the true direction, but I keep getting NaN values after a few epochs.

Given that the output activation is linear, my custom loss is:

import tensorflow as tf
import numpy as np
import tensorflow.keras.backend as K

def square_angular_distance(y_true, y_pred):
    # Column 0 is theta (polar angle), column 1 is phi (azimuth).
    the_pred = K.abs(y_pred[:, 0])
    phi_pred = y_pred[:, 1] % (2 * np.pi)

    the_true = y_true[:, 0]
    phi_true = y_true[:, 1]

    cos_phi_pred = K.cos(phi_pred)
    cos_phi_true = K.cos(phi_true)
    sin_phi_pred = K.sin(phi_pred)
    sin_phi_true = K.sin(phi_true)

    cos_the_pred = K.cos(the_pred)
    cos_the_true = K.cos(the_true)
    sin_the_pred = K.sin(the_pred)
    sin_the_true = K.sin(the_true)

    # Convert both directions to Cartesian unit vectors.
    v_true = K.stack((sin_the_true*cos_phi_true, sin_the_true*sin_phi_true, cos_the_true), axis=1)
    v_pred = K.stack((sin_the_pred*cos_phi_pred, sin_the_pred*sin_phi_pred, cos_the_pred), axis=1)

    # The dot product of two unit vectors is the cosine of the angle between them.
    v_dot = K.batch_dot(v_true, v_pred)

    # Angle in degrees; the clip keeps the argument inside the domain of acos.
    angle_dist = tf.math.acos(K.clip(v_dot, -1., 1.))*180./np.pi

    return K.mean(K.square(angle_dist), axis=-1)

Here y_pred[:, 0] and y_pred[:, 1] are the theta and phi angles of a unit vector, respectively (likewise for y_true).
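
For sanity-checking the geometry on known pairs of directions, the same computation can be written without TensorFlow. This is only a minimal sketch in plain Python; the helper names to_unit_vector and angular_distance_deg are hypothetical and not part of the code above.

```python
import math

def to_unit_vector(theta, phi):
    """Convert spherical angles (radians) to a Cartesian unit vector."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def angular_distance_deg(theta_a, phi_a, theta_b, phi_b):
    """Angle in degrees between two directions given as (theta, phi)."""
    a = to_unit_vector(theta_a, phi_a)
    b = to_unit_vector(theta_b, phi_b)
    dot = sum(x * y for x, y in zip(a, b))
    # Rounding can leave dot marginally outside [-1, 1]; clamp before acos.
    dot = max(-1.0, min(1.0, dot))
    return math.degrees(math.acos(dot))

# +z axis versus +x axis: perpendicular directions, so about 90 degrees.
print(angular_distance_deg(0.0, 0.0, math.pi / 2, 0.0))
```

Comparing a few such values against the Keras loss on the same inputs is one way to confirm that the forward pass is right and that the problem lies elsewhere.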

I tried using regularizers, clipping the gradient, and adjusting the learning rate, and I also checked that the data contain no NaN/Inf values.

I also tried to clip the output values using a custom activation function for the output layer but it didn't resolve the problem.

Any suggestions on what I am doing wrong?

CodePudding user response：

The comment from @ATony resolved the problem.

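Narrowing the range passed to tf.math.acos prevented the loss from becoming NaN. The derivative of acos(x) is -1/sqrt(1 - x^2), which diverges at x = ±1, so clipping to exactly [-1, 1] still lets the gradient blow up at the boundary. Clipping slightly inside that range keeps it finite: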

K.clip(v_dot, -.999, .999)
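
To illustrate why the tighter clip helps, here is a rough pure-Python model of the chain rule through acos(clip(x)). It is a sketch under simplifying assumptions, not TensorFlow's actual gradient code: the functions acos_grad, clip_grad and chained_grad are hypothetical, and the clip derivative is taken as 1 inside the range and 0 outside.

```python
import math

def acos_grad(x):
    """d/dx acos(x) = -1 / sqrt(1 - x^2); infinite at x = +/-1."""
    denom = math.sqrt(max(0.0, 1.0 - x * x))
    return -math.inf if denom == 0.0 else -1.0 / denom

def clip_grad(x, lo, hi):
    """Assumed derivative of clip(x, lo, hi): 1 inside the range, 0 outside."""
    return 1.0 if lo <= x <= hi else 0.0

def chained_grad(x, limit):
    """Chain rule for acos(clip(x, -limit, limit)) with respect to x."""
    clipped = min(max(x, -limit), limit)
    return acos_grad(clipped) * clip_grad(x, -limit, limit)

# Rounding can push the dot product of two unit vectors slightly past 1.
x = 1.0 + 1e-7

print(math.isnan(chained_grad(x, 1.0)))       # True: inf * 0 is NaN
print(math.isfinite(chained_grad(x, 0.999)))  # True: finite * 0 stays finite
```

The trade-off is that with limits of ±0.999 the loss can no longer reach exactly zero: acos(0.999) is about 2.6 degrees, so predictions closer than that to the target all receive the same loss and no gradient.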