Home > Back-end >  calculating F1 score in Excel
calculating F1 score in Excel

Time:03-06

In statistics

if you predict a value 1 and the original value was 1 then this is called True Positive (TP)

if the prediction is 0 and the original value was 1 then it is False Negative (FN)

if the prediction is 1 and the original value was 0 then it is False Positive (FP)

if the prediction is 0 and the original value was 0 then it is True Negative (TN)

F1 score is a method to measure the relation between 2 datasets

Which is calculated like this

enter image description here

I have a file that has the results of 1000 predictions each value is either 0 or 1

that looks like this

Label    0    1    2   ....   0    1    2    ...   
--------------------------------------------------
         0    0    1          0    1    1   
         1    0    0          1    0    1   
         0    1    1          0    0    0   
         0    0    1          0    1    1   
         1    0    0          0    0    0   
         1    0    1          0    1    1   
         0    0    1          1    0    1   
         1    1    1          1    1    0   
         0    1    0          1    0    1   
         1    1    0          0    0    1   
         1    0    0          0    1    0   
         1    1    0          1    1    1   
         0    1    1          1    0    1   
         0    0    1          0    1    1   
         0    1    0          1    0    0   
         1    1    0          0    1    0   
         0    1    1          1    1    1   
         0    0    0          1    0    0   
         0    0    1          0    1    1   
         0    1    1          1    0    0   

The only way I can think of having a 1000 new columns to detect TP and another to detect FP and anorger 1000 for TN and another 1000 for FN

where each has this equation =if(AND(B6=NB6,B6=1), 1, 0) for TP and so on

this is not a good solution

is there a faster easier way to get F1 score for each one of these columns or even better F1 score micro and macro for all?

CodePudding user response:

I'm not going to finish this tonight, but I think the formulas for columns B and E of the test data would be

Precision:

=COUNTIFS(B2:B21,1,E2:E21,1)/(COUNTIFS(B2:B21,1,E2:E21,1) COUNTIFS(B2:B21,0,E2:E21,1))

Recall

=COUNTIFS(B2:B21,1,E2:E21,1)/(COUNTIFS(B2:B21,1,E2:E21,1) COUNTIFS(B2:B21,1,E2:E21,0))

and you can get the others in the same way

enter image description here

  • Related