Computation within a column and averaging based on another column

Time:10-15

I have a dataframe that looks like this:

player   school    team     result     exam
a        s1        z        False      English
a        s1        z        True       German
a        s1        z        True       Geography
b        s1        z        True       Geography
b        s1        z        True       History
b        s1        z        False      English
c        s1        y        False      English
d        s1        y        False      German
d        s1        y        True       English
d        s1        y        True       History
e        s1        w        True       German
e        s1        w        True       History
f        s1        w        False      English

The school is always the same. I want to compute the performance of every player with the formula (number of Trues) / (number of Trues + number of Falses), and then see the average per team. I would imagine something like this (I hope I didn't make any computation errors):

school    team    TeamResult
s1        z       0.67
s1        y       0.67
s1        w       0.50

Does anyone know how I could "go back" to the team each player belongs to? What I tried so far was extremely inefficient: creating a new dataframe from the "big" one for each team, like df_new = df[df['team'] == 'z']. Is there a more direct way?

CodePudding user response:

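You could use groupby with mean. Since result is boolean, its mean is the fraction of True values. Note that this pools all rows of a team, so it is a row-weighted average and can differ from the average of per-player ratios when players have different numbers of rows: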

df.groupby(['school', 'team'])['result'].mean()
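For anyone who wants to run this, here is a minimal sketch that rebuilds the sample data from the question and applies the one-liner. It assumes result is stored as real booleans (not the strings "True"/"False"); the names rows and team_mean are only for illustration.

```python
import pandas as pd

# Sample data transcribed from the question
rows = [
    ("a", "s1", "z", False, "English"),
    ("a", "s1", "z", True, "German"),
    ("a", "s1", "z", True, "Geography"),
    ("b", "s1", "z", True, "Geography"),
    ("b", "s1", "z", True, "History"),
    ("b", "s1", "z", False, "English"),
    ("c", "s1", "y", False, "English"),
    ("d", "s1", "y", False, "German"),
    ("d", "s1", "y", True, "English"),
    ("d", "s1", "y", True, "History"),
    ("e", "s1", "w", True, "German"),
    ("e", "s1", "w", True, "History"),
    ("f", "s1", "w", False, "English"),
]
df = pd.DataFrame(rows, columns=["player", "school", "team", "result", "exam"])

# The mean of a boolean column is the fraction of True values,
# here taken over all rows of each (school, team) group
team_mean = df.groupby(["school", "team"])["result"].mean()
print(team_mean)
```

The result is a Series indexed by (school, team), with the groups sorted by key.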

To get the layout of the requested output (a TeamResult column, sorted in descending order):

(df.groupby(['school', 'team'])
   ['result'].mean()
   .rename('TeamResult')
   .sort_values(ascending=False)
   .reset_index()
)

output:

  school team  TeamResult
0     s1    w    0.666667
1     s1    z    0.666667
2     s1    y    0.500000
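If the goal is literally the formula from the question (a ratio per player first, then the average of those ratios per team), one possible sketch is a two-step groupby. This is an alternative reading of the question, not the method shown above; the names per_player and per_team are made up for illustration, and the exam column is left out because it is not needed here.

```python
import pandas as pd

# Same sample data as in the question, without the exam column
df = pd.DataFrame({
    "player": list("aaabbbcdddeef"),
    "school": ["s1"] * 13,
    "team": list("zzzzzzyyyywww"),
    "result": [False, True, True, True, True, False, False,
               False, True, True, True, True, False],
})

# Step 1: fraction of True results for each player
per_player = df.groupby(["school", "team", "player"])["result"].mean()

# Step 2: average the per-player fractions within each team
per_team = (per_player.groupby(level=["school", "team"])
            .mean()
            .rename("TeamResult")
            .reset_index())
print(per_team)
```

With this data the two approaches disagree for teams y and w, because their players have different numbers of rows: player c (one False) counts as much as player d (three rows) in the two-step version, but only as one row out of four in the pooled version.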