How to generate sklearn classification report for multiclass multioutput data

I am using the code below to calculate the F1 score for my dataset:

from sklearn.metrics import f1_score
from sklearn.preprocessing import MultiLabelBinarizer

m = MultiLabelBinarizer().fit(y_test_true_f)

print("F1-score is : {:.1%}".format(f1_score(m.transform(y_test_true_f),
         m.transform(y_pred_f),
         average='macro')))

and the classification report:

from sklearn.metrics import classification_report
print(classification_report(m.transform(y_test_true_f), m.transform(y_pred_f)))

but the output of the classification report does not show the label names:

              precision    recall  f1-score   support

           0       0.88      1.00      0.94        15
           1       1.00      0.95      0.98        22
           2       0.82      0.74      0.78        19
           3       0.90      0.85      0.88        33
           4       0.68      0.87      0.76        15
           5       0.94      0.98      0.96        46
           6       0.83      0.94      0.88        16
           7       0.33      0.86      0.48         7
           8       0.95      0.90      0.92        20
           9       0.67      1.00      0.80        10
          10       0.91      0.83      0.87        12
          11       0.29      0.33      0.31         6
          12       0.25      0.40      0.31         5
          13       0.00      0.00      0.00         3
          14       0.88      1.00      0.93         7
          15       0.50      0.75      0.60         8
          16       0.50      1.00      0.67         1
          17       1.00      1.00      1.00        10
          18       0.80      1.00      0.89         8
          19       0.89      1.00      0.94        17
          20       0.88      1.00      0.94        15
          21       0.86      0.80      0.83        15
          22       0.71      0.79      0.75        19
          23       0.65      1.00      0.79        11
          24       0.74      0.82      0.78        17
          25       1.00      1.00      1.00        11
          26       0.75      0.86      0.80        14

How should I update my code to see the label names instead of the numbers 0, 1, 2, 3, ...?

CodePudding user response：

Judging by the output, there are 27 classes in the dataset. MultiLabelBinarizer transforms the labels into a binary indicator matrix with one column per class, so the report shows the column indices 0, 1, 2, ... instead of the label names. To get the names back, use the binarizer's attribute that holds the mapping from column index to class label.

That attribute is .classes_. Pass it as the target_names parameter of classification_report, as follows:

print(classification_report(m.transform(y_test_true_f), m.transform(y_pred_f), target_names=m.classes_))

This should show the class labels in the report.
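
To try the idea end to end, here is a minimal sketch. The toy label lists y_true and y_pred are hypothetical stand-ins for y_test_true_f and y_pred_f, which the question does not show. The labels are converted with str() on the assumption that they might not already be strings, and zero_division=0 is only there to silence warnings for labels that are never predicted.

```python
from sklearn.metrics import classification_report
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy data: each sample carries one or more labels.
y_true = [["cat", "dog"], ["bird"], ["dog"], ["cat", "bird"]]
y_pred = [["cat"], ["bird"], ["dog", "bird"], ["cat", "bird"]]

# Fit on the true labels; classes_ then holds the label for each column.
m = MultiLabelBinarizer().fit(y_true)

# target_names is matched to the indicator columns by position,
# and classes_ is stored in that same column order.
report = classification_report(
    m.transform(y_true),
    m.transform(y_pred),
    target_names=[str(c) for c in m.classes_],
    zero_division=0,
)
print(report)
```

Because target_names is matched by position, the names should come from the same fitted binarizer that produced the indicator matrices. A separately built list in a different order would silently mislabel the rows.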

CodePudding user response：

Specify them as target_names when calling classification_report.

From the scikit-learn documentation examples:

>>> from sklearn.metrics import classification_report
>>> y_true = [0, 1, 2, 2, 2]
>>> y_pred = [0, 0, 2, 2, 1]
>>> target_names = ['class 0', 'class 1', 'class 2']
>>> print(classification_report(y_true, y_pred, target_names=target_names))
              precision    recall  f1-score   support

     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3

    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5