TensorFlow ReLU output activation returns NaN


I have a YOLO-like network architecture where, on the output layer, I want to predict bounding boxes with coordinates such as x, y, width, and height. When I use a linear activation function everything works fine, but the model sometimes predicts negative values, which don't make sense in my case: all values to predict are between 0 and 1 for x and y, and are 3 or 5 for width and height. I thought I could use a ReLU activation for the output instead, but when I do, the network gets stuck with NaN as its loss value.

Any ideas as to why that could be?
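
For illustration, here is a minimal sketch of the two output heads I am comparing (the Dense backbone and the shapes are placeholders, not my real YOLO-like network):

    import tensorflow as tf

    inputs = tf.keras.Input(shape=(128,))  # placeholder backbone features
    hidden = tf.keras.layers.Dense(64, activation="relu")(inputs)

    # Variant 1: linear output -- trains fine, but predictions can go negative.
    boxes_linear = tf.keras.layers.Dense(4, activation="linear")(hidden)

    # Variant 2: ReLU output -- non-negative by construction, but the loss
    # goes to NaN during training.
    boxes_relu = tf.keras.layers.Dense(4, activation="relu")(hidden)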

CodePudding user response:

Maybe your learning rate is too high? With a ReLU output, one overly large update can make the loss explode to NaN, or push the pre-activations so far negative that the units output zero and the gradients die.
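
If that is the cause, a smaller step size and gradient clipping are cheap experiments. Here is a minimal sketch with the Keras Adam optimizer; the toy model, the 1e-4 rate, and the clipnorm value are placeholders to adapt, not tuned settings:

    import tensorflow as tf

    # Toy stand-in for the detection head: 4 box values per prediction.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(128,)),
        tf.keras.layers.Dense(4, activation="relu"),  # the problematic ReLU output
    ])

    # A smaller learning rate plus gradient clipping limits the oversized
    # updates that can drive the loss to NaN.
    optimizer = tf.keras.optimizers.Adam(
        learning_rate=1e-4,  # try an order of magnitude below your current rate
        clipnorm=1.0,        # rescale gradients whose norm exceeds 1.0
    )
    model.compile(optimizer=optimizer, loss="mse")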

CodePudding user response:

It is hard to say without more details. However, NaN losses are a common problem in cases where the input data is not normalized correctly.
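
As a quick sanity check, you could standardize the inputs and rescale the regression targets before training. A generic sketch assuming NumPy arrays named x_train and y_train (random placeholder data, not your dataset):

    import numpy as np

    # Placeholder data; substitute your real arrays.
    x_train = np.random.rand(1000, 128).astype("float32")
    y_train = np.random.rand(1000, 4).astype("float32")

    # Standardize inputs to zero mean / unit variance; the epsilon guards
    # against division by zero for constant features.
    mean, std = x_train.mean(axis=0), x_train.std(axis=0)
    x_train = (x_train - mean) / (std + 1e-7)

    # Rescale targets into [0, 1] so width/height values (e.g. 3 or 5)
    # live on the same scale as x and y.
    y_min, y_max = y_train.min(axis=0), y_train.max(axis=0)
    y_train = (y_train - y_min) / (y_max - y_min + 1e-7)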

Here are some links that may be worth looking into. If they do not help, you will probably have to give more info for someone to give a useful answer:

https://discuss.tensorflow.org/t/getting-nan-for-loss/4826

NaN loss when training regression network
