Keras Loss is Negative (Binary CrossEntropy, double output model)


I have a model with two outputs, one for regression and one for classification. The loss is MSE for the regression output and binary crossentropy for the classification output. However, the loss is negative. I have read that this can happen with a sigmoid activation and binary crossentropy if the response variable is not between 0 and 1, but I have checked my input and output data below and they are within that range, so I am unsure where this issue stems from.
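As a sanity check on that claim (this snippet is mine, not from the original post): Keras binary crossentropy is non-negative for targets in [0, 1], but a target outside that range can drive it negative.

import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()

# Targets inside [0, 1] always yield a non-negative loss:
print(bce([0., 1.], [0.1, 0.9]).numpy())  # ~0.105

# A target outside [0, 1] can make the loss negative:
# loss = -[2*log(0.9) + (1-2)*log(0.1)] ~= -2.09
print(bce([2.], [0.9]).numpy())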

Loss:

Epoch 1/300
69/69 - 30s - loss: 3.4360 - elicat_loss: 3.3823 - amd_loss: 0.0538 - elicat_root_mean_squared_error: 2.2046 - amd_accuracy: 0.0725 - amd_auc: 0.0000e+00 - val_loss: 2.9322 - val_elicat_loss: 3.1193 - val_amd_loss: -1.8714e-01 - val_elicat_root_mean_squared_error: 2.0623 - val_amd_accuracy: 0.2393 - val_amd_auc: 0.0000e+00 - 30s/epoch - 432ms/step
Epoch 2/300
69/69 - 11s - loss: 2.3846 - elicat_loss: 2.9118 - amd_loss: -5.2726e-01 - elicat_root_mean_squared_error: 1.9905 - amd_accuracy: 0.2736 - amd_auc: 0.0000e+00 - val_loss: 2.1002 - val_elicat_loss: 2.8091 - val_amd_loss: -7.0891e-01 - val_elicat_root_mean_squared_error: 1.9249 - val_amd_accuracy: 0.2991 - val_amd_auc: 0.0000e+00 - 11s/epoch - 155ms/step
Epoch 3/300
69/69 - 11s - loss: 1.7066 - elicat_loss: 2.7191 - amd_loss: -1.0124e+00 - elicat_root_mean_squared_error: 1.8944 - amd_accuracy: 0.2975 - amd_auc: 0.0000e+00 - val_loss: 1.4961 - val_elicat_loss: 2.6764 - val_amd_loss: -1.1802e+00 - val_elicat_root_mean_squared_error: 1.8619 - val_amd_accuracy: 0.2991 - val_amd_auc: 0.0000e+00 - 11s/epoch - 155ms/step
Epoch 4/300
69/69 - 11s - loss: 1.1482 - elicat_loss: 2.6324 - amd_loss: -1.4842e+00 - elicat_root_mean_squared_error: 1.8477 - amd_accuracy: 0.2984 - amd_auc: 0.0000e+00 - val_loss: 0.9810 - val_elicat_loss: 2.6064 - val_amd_loss: -1.6254e+00 - val_elicat_root_mean_squared_error: 1.8275 - val_amd_accuracy: 0.2991 - val_amd_auc: 0.0000e+00 - 11s/epoch - 155ms/step
Epoch 5/300
69/69 - 11s - loss: 0.6688 - elicat_loss: 2.5750 - amd_loss: -1.9062e+00 - elicat_root_mean_squared_error: 1.8182 - amd_accuracy: 0.2984 - amd_auc: 0.0000e+00 - val_loss: 0.5058 - val_elicat_loss: 2.5615 - val_amd_loss: -2.0557e+00 - val_elicat_root_mean_squared_error: 1.8051 - val_amd_accuracy: 0.2991 - val_amd_auc: 0.0000e+00 - 11s/epoch - 154ms/step

Input: (fed into the model as a single dataset, as shown in the code below; the 1st element is the image input, the 2nd (elicat) is the target for the regression output, and the 3rd (amd) is the classification target)

array([[[[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.],
         ...,
         [0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]]],], dtype=float32)>, 

<tf.Tensor: shape=(16,), dtype=float64, numpy=
array([1.66666667, 1.        , 1.66666667, 1.        , 1.        ,])>, 

<tf.Tensor: shape=(16,), dtype=float64, numpy=array([0., 0., 0., 0.,])>)

Output: (the 1st output is the regression, whose loss looks normal; the 2nd output is the classification)

[array([[0.696372  ],
        [0.7079218 ],
        [0.7063726 ],
        [0.6944795 ]], dtype=float32),
 array([[0.9998816 ],
        [0.9999999 ],
        [0.9999999 ],
        [0.99996316]], dtype=float32)]

Model Summary:

Layer (type)                                         Output Shape           Param #     Connected to
=====================================================================================================
input_1 (InputLayer)                                 [(None, 300, 300, 3)]  0           []
model (Functional)                                   (None, 10, 10, 1536)   12930622    ['input_1[0][0]']
global_average_pooling2d (GlobalAveragePooling2D)    (None, 1536)           0           ['model[0][0]']
dropout (Dropout)                                    (None, 1536)           0           ['global_average_pooling2d[0][0]']
global_average_pooling2d_1 (GlobalAveragePooling2D)  (None, 1536)           0           ['model[0][0]']
dense (Dense)                                        (None, 1)              1536        ['dropout[0][0]']
dropout_1 (Dropout)                                  (None, 1536)           0           ['global_average_pooling2d_1[0][0]']
elicat (Dense)                                       (None, 1)              2           ['dense[0][0]']
amd (Dense)                                          (None, 1)              1536        ['dropout_1[0][0]']
=====================================================================================================
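For reference, a minimal sketch (mine, not from the post) of how a two-head model matching this summary could be wired. The EfficientNetB3 backbone is an assumption based on the 300x300 input and 10x10x1536 feature map, and the dropout rates are placeholders:

import tensorflow as tf

# Assumed backbone: EfficientNetB3 produces a (None, 10, 10, 1536) feature
# map for a 300x300 input; the original sub-model's exact architecture is
# not shown, so its parameter count may differ from the post's 12,930,622.
backbone = tf.keras.applications.EfficientNetB3(include_top=False,
                                                input_shape=(300, 300, 3))

inputs = tf.keras.Input(shape=(300, 300, 3))
features = backbone(inputs)

# Regression head ('elicat'): GAP -> Dropout -> Dense -> Dense(1).
x1 = tf.keras.layers.GlobalAveragePooling2D()(features)
x1 = tf.keras.layers.Dropout(0.2)(x1)                 # rate is a guess
x1 = tf.keras.layers.Dense(1, use_bias=False)(x1)     # 1536 params, as in the summary
elicat = tf.keras.layers.Dense(1, name='elicat')(x1)  # 2 params (weight + bias)

# Classification head ('amd'): GAP -> Dropout -> Dense(1, sigmoid).
x2 = tf.keras.layers.GlobalAveragePooling2D()(features)
x2 = tf.keras.layers.Dropout(0.2)(x2)
amd = tf.keras.layers.Dense(1, activation='sigmoid',
                            use_bias=False, name='amd')(x2)  # 1536 params

model = tf.keras.Model(inputs, [elicat, amd])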

Code:

self.model.compile(tf.keras.optimizers.Adam(eval(lr)),
                   loss={'elicat': 'mse',
                         'amd': tf.keras.losses.BinaryCrossentropy()},
                   metrics={'elicat': tf.keras.metrics.RootMeanSquaredError(),
                            'amd': ['accuracy', tf.keras.metrics.AUC()]},
                   weighted_metrics=[])


self.model.fit(self.train_ds, epochs=epoch, callbacks=callbacks, 
               validation_data=self.val_ds, verbose = 2)

CodePudding user response:

This may be because you forgot to normalize your input data. I suggest adding a suitable Lambda layer at the model input, as sketched below.
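A minimal sketch of that suggestion (assuming raw [0, 255] pixel values; this is not the asker's actual code):

import tensorflow as tf

# Hypothetical: rescale [0, 255] pixels into [0, 1] right at the input;
# the rest of the model would then be built on `x` instead of `inputs`.
inputs = tf.keras.Input(shape=(300, 300, 3))
x = tf.keras.layers.Lambda(lambda t: t / 255.0)(inputs)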

CodePudding user response:

The problem came from how I created the dataset. I zipped the input together with the two outputs into a single flat tuple, when I should have zipped the two outputs together first and then zipped them with the input. (Keras interprets a three-element dataset tuple as (inputs, targets, sample_weights), so the labels were silently misrouted.)

Old and wrong implementation:

import tensorflow as tf
from tqdm import tqdm

def getDS(list_ds):
    X, Y1, Y2 = [], [], []
    # Materialize the dataset into Python lists of tensors.
    for item in tqdm(list_ds.take(len(list_ds))):
        img, label1, label2 = process_path(item)
        X.append(img)
        Y1.append(label1)
        Y2.append(label2)

    X = tf.data.Dataset.from_tensor_slices(X)
    Y1 = tf.data.Dataset.from_tensor_slices(Y1)
    Y2 = tf.data.Dataset.from_tensor_slices(Y2)
    # Bug: the flat (X, Y1, Y2) zip yields three-element tuples,
    # which Keras misreads as (inputs, targets, sample_weights).
    return tf.data.Dataset.zip((X, Y1, Y2)).batch(16)

Correct method of zipping:

return tf.data.Dataset.zip((X, tf.data.Dataset.zip((Y1, Y2)))).batch(16)
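With that nesting, each element has the (inputs, (y1, y2)) structure that Model.fit expects for a two-output model. A self-contained sketch with toy data (shapes and values are placeholders, not the original dataset):

import tensorflow as tf

X = tf.data.Dataset.from_tensor_slices(tf.zeros([4, 300, 300, 3]))
Y1 = tf.data.Dataset.from_tensor_slices(tf.constant([1.67, 1.0, 1.67, 1.0]))
Y2 = tf.data.Dataset.from_tensor_slices(tf.constant([0., 0., 0., 0.]))

# Flat zip (wrong): element is (x, y1, y2) -> Keras reads y2 as sample weights.
# Nested zip (right): element is (x, (y1, y2)) -> one target per output head.
ds = tf.data.Dataset.zip((X, tf.data.Dataset.zip((Y1, Y2)))).batch(2)
print(ds.element_spec)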

Thanks to those who helped out!
