Questions regarding custom multiclass metrics (Keras)

Could anyone explain how to write a custom multiclass metric for Keras? I tried to write one but ran into some issues. The main problem is that I am not familiar with how tensors work during training (I think this is called graph mode?). I can already build a confusion matrix and derive the F1 score from it using NumPy or Python lists.

I printed out y_true and y_pred to try to understand them, but the output was not what I expected.

Below is the function I used:

def f1_scores(y_true,y_pred):

    y_true = K.print_tensor(y_true, message='y_true = ')
    y_pred = K.print_tensor(y_pred, message='y_pred = ')
    print(f"y_true_shape:{K.int_shape(y_true)}")
    print(f"y_pred_shape:{K.int_shape(y_pred)}")

    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)

    gt = K.argmax(y_true_f)
    pred = K.argmax(y_pred_f)

    print(f"pred_print:{pred}")
    print(f"gt_print:{gt}")

    pred = K.print_tensor(pred, message='pred= ')
    gt = K.print_tensor(gt, message='gt =')
    print(f"pred_shape:{K.int_shape(pred)}")
    print(f"gt_shape:{K.int_shape(gt)}")

    pred_f = K.flatten(pred)
    gt_f = K.flatten(gt)

    pred_f = K.print_tensor(pred_f, message='pred_f= ')
    gt_f = K.print_tensor(gt_f, message='gt_f =')
    print(f"pred_f_shape:{K.int_shape(pred_f)}")
    print(f"gt_f_shape:{K.int_shape(gt_f)}")

    conf_mat = tf.math.confusion_matrix(y_true_f,y_pred_f, num_classes = 14)

    """
    add codes to find F1 score for each class
    """

    # return an arbitrary number, as F1 scores not found yet.
    return 1

The output when epoch 1 had just started:

y_true_shape:(None, 256, 256, 14)
y_pred_shape:(None, 256, 256, 14)
pred_print:Tensor("ArgMax_1:0", shape=(), dtype=int64)
gt_print:Tensor("ArgMax:0", shape=(), dtype=int64)
pred_shape:()
gt_shape:()
pred_f_shape:(1,)
gt_f_shape:(1,)

The output for the rest of the steps and epochs was similar to the following:

y_true =  [[[[1 0 0 ... 0 0 0]
   [1 0 0 ... 0 0 0]
   [1 0 0 ... 0 0 0]
   ...

y_pred =  [[[[0.0889623 0.0624801107 0.0729747042 ... 0.0816219151 0.0735477135 0.0698677748]
   [0.0857798532 0.0721047595 0.0754121244 ... 0.0723947287 0.0728530064 0.0676521733]
   [0.0825942457 0.0670698211 0.0879610255 ... 0.0721599609 0.0845924541 0.0638583601]
   ...

pred=  1283828
gt = 0
pred_f=  [1283828]
gt_f = [0]

Why is pred a single number instead of a list of numbers, each representing the index of a class? Similarly, why is pred_f a list with only one number instead of a list of indices?

And for gt (and gt_f), why is the value 0? I expected them to be lists of indices.

CodePudding user response：

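It looks like argmax() is simply being applied to the flattened y. After K.flatten the tensor is 1-D, so argmax returns a single index into the whole flattened tensor rather than one index per pixel. That is presumably also why gt is 0: the very first value of y_true is already 1, the maximum.
You need to apply argmax() to the unflattened tensor and specify which axis it should reduce. That is probably the last one, in your case axis 3. Then you'll get pred with shape (None, 256, 256), containing integers between 0 and 13.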
Try something like this: pred = K.argmax(y_pred, axis=3)
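To make the shape difference concrete, here is a minimal sketch that uses NumPy as a stand-in for the Keras backend (an assumption on my part, since I don't know what K is in your code). The array sizes are made up for illustration; only the 14 classes come from your question.

```python
import numpy as np

# Hypothetical small batch: 2 images of 4x4 pixels, 14 classes per pixel.
rng = np.random.default_rng(0)
y_pred = rng.random((2, 4, 4, 14))

# argmax over the flattened array: one index into all 2*4*4*14 values.
flat_index = np.argmax(y_pred.ravel())

# argmax over the last axis: one class index per pixel.
per_pixel = np.argmax(y_pred, axis=-1)

print(np.shape(flat_index))  # ()
print(per_pixel.shape)       # (2, 4, 4)
```

The first result is a scalar, like your pred, while the second keeps the batch and spatial dimensions and holds values from 0 to 13.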
See the TensorFlow documentation for argmax. (I'm not sure you're using exactly that function, though, since I can't see what K is imported as.)
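For the part of your function that is still a placeholder, here is a rough sketch of how per-class F1 could be derived from the confusion matrix once argmax runs over the class axis. It assumes TensorFlow 2.x, one-hot y_true and per-class scores in y_pred with the classes on the last axis; the name per_class_f1 and the tiny test tensors are my own, not from your code. Note that tf.math.confusion_matrix expects 1-D tensors of class indices, so it should receive the flattened argmax results rather than the flattened one-hot and probability tensors.

```python
import numpy as np
import tensorflow as tf


def per_class_f1(y_true, y_pred, num_classes=14):
    """Per-class F1 for tensors shaped (batch, H, W, num_classes)."""
    # Reduce the class axis first: one class index per pixel.
    gt = tf.argmax(y_true, axis=-1)
    pred = tf.argmax(y_pred, axis=-1)

    # Only now flatten, so each element is a class index for one pixel.
    gt_f = tf.reshape(gt, [-1])
    pred_f = tf.reshape(pred, [-1])

    # Rows are true classes, columns are predicted classes.
    conf = tf.math.confusion_matrix(
        gt_f, pred_f, num_classes=num_classes, dtype=tf.float32
    )
    tp = tf.linalg.diag_part(conf)
    fp = tf.reduce_sum(conf, axis=0) - tp  # predicted as class c, but not c
    fn = tf.reduce_sum(conf, axis=1) - tp  # truly class c, but missed

    # F1 = 2TP / (2TP + FP + FN); classes absent from the batch get 0.
    return tf.math.divide_no_nan(2.0 * tp, 2.0 * tp + fp + fn)


# Tiny made-up example: one 2x2 image with 3 classes.
labels = tf.constant([[[0, 1], [2, 2]]])
preds = tf.constant([[[0, 1], [2, 1]]])
f1 = per_class_f1(tf.one_hot(labels, 3), tf.one_hot(preds, 3), num_classes=3)
print(f1.numpy())  # approximately [1.0, 0.667, 0.667]
```

If you want a single number from the metric, you could return tf.reduce_mean of this vector. Keep in mind that this value is computed per batch, so what Keras reports over an epoch may differ from an F1 score computed over the whole dataset at once.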