Pandas: Calculate true positive rate for each row

Time:09-22

I have a dataframe like this, with one column holding the label and the other columns holding predictions:

    label   pred1   pred2   pred3
0   Apple   Apple  Orange   Apple
1  Orange  Orange  Orange  Orange

I would like to extend this dataframe with the true positive rate (TP / (TP + FN)) for each row. This column should look like this:

   Score
0   0.66
1   1.00

I am unsure how to go about this. Are there pandas functions that would help with this task?

Executable code: https://www.online-python.com/WP7wbgcqMS

CodePudding user response:

Here is one approach where we convert the data to long format and check if the label equals the prediction. The average of the True/False values will be your Score.

import pandas as pd

d = {'Label': ['Apple','Orange'], 'pred1': ['Apple','Orange'], 'pred2': ['Orange','Orange'], 'pred3': ['Apple','Orange']}
df = pd.DataFrame(data=d)

df = df.melt(id_vars='Label', value_name='pred')
df['match'] = df['Label'].eq(df['pred'])
df.groupby('Label')['match'].mean().reset_index(name='Score')

Output

    Label     Score
0   Apple  0.666667
1  Orange  1.000000
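
Note that grouping by Label gives one score per distinct label, which equals one score per row only while every label appears exactly once. Below is a minimal sketch of the same long-format idea that keeps the original row index instead. The third row, which repeats the label Apple, is hypothetical data added for illustration, and `ignore_index=False` assumes pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical data: the third row repeats the label 'Apple'
df = pd.DataFrame({
    'Label': ['Apple', 'Orange', 'Apple'],
    'pred1': ['Apple', 'Orange', 'Orange'],
    'pred2': ['Orange', 'Orange', 'Orange'],
    'pred3': ['Apple', 'Orange', 'Orange'],
})

# Keep the original row index through the melt so each row can be scored separately
long = df.melt(id_vars='Label', value_name='pred', ignore_index=False)
long['match'] = long['Label'].eq(long['pred'])

# Group by the original row index (level 0) rather than by label
df['Score'] = long.groupby(level=0)['match'].mean()
print(df)
```

Assigning the grouped result back to `df['Score']` aligns on the row index, so the two Apple rows each get their own score.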

CodePudding user response:

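Maybe something like this, applied to the original (unmelted) dataframe with the label in the first column:

# Compare every value in a row with that row's label (the first entry after transposing)
temp = df.T.apply(lambda x: x.iloc[0] == x).astype(int)
# Subtract 1 to exclude the label itself, which always matches
(temp.sum() - 1) / (temp.count() - 1)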

Out:

0    0.666667
1    1.000000
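
The transpose and the -1 correction can probably be avoided by comparing only the prediction columns against the label, row by row. This is a sketch rather than a definitive solution, and it assumes that every prediction column name starts with `pred`.

```python
import pandas as pd

df = pd.DataFrame({
    'Label': ['Apple', 'Orange'],
    'pred1': ['Apple', 'Orange'],
    'pred2': ['Orange', 'Orange'],
    'pred3': ['Apple', 'Orange'],
})

# Assumption: every prediction column name starts with 'pred'
pred_cols = [c for c in df.columns if c.startswith('pred')]

# eq(..., axis=0) compares each prediction column with the label of the same row;
# the row-wise mean of the resulting booleans is the fraction of matching predictions
df['Score'] = df[pred_cols].eq(df['Label'], axis=0).mean(axis=1)
print(df)
```

Because the label column is never part of the comparison, no adjustment to the numerator or denominator is needed.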
