I have a dataframe that looks like this:
player school team result exam
a s1 z False English
a s1 z True German
a s1 z True Geography
b s1 z True Geography
b s1 z True History
b s1 z False English
c s1 y False English
d s1 y False German
d s1 y True English
d s1 y True History
e s1 w True German
e s1 w True History
f s1 w False English
The school is always the same. I want to compute the performance of every player with the formula (number of Trues) / (number of Trues + number of Falses), and then see the average per team. I would imagine something like this (I hope I didn't make any computation error):
school team TeamResult
s1 z 0.67
s1 y 0.67
s1 w 0.50
Does anyone know how I could "go back" to the team the players are from? What I tried so far was extremely inefficient, namely creating a new dataframe from the "big" one for each team, like df_new = df[df['team'] == 'z']. Is there a more direct way?
CodePudding user response:
You could use groupby + mean:
df.groupby(['school', 'team'])['result'].mean()
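This works because pandas treats True as 1 and False as 0, so the mean of a boolean column equals (number of Trues) / (number of Trues + number of Falses). A minimal check, using a made-up Series for illustration (the names result, share and manual are not from the question):

```python
import pandas as pd

# Hypothetical boolean results for a single player (illustration only)
result = pd.Series([False, True, True])

# True counts as 1 and False as 0, so the mean is the share of Trues
share = result.mean()

# The same ratio written out: Trues / (Trues + Falses)
manual = result.sum() / len(result)
```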
To match the exact specified output:
(df.groupby(['school', 'team'])
['result'].mean()
.rename('TeamResult')
.sort_values(ascending=False)
.reset_index()
)
output:
school team TeamResult
0 s1 w 0.666667
1 s1 z 0.666667
2 s1 y 0.500000
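Note that the one-step mean above averages over all rows of a team, so a player with more exams weighs more. If the intent is instead to compute each player's ratio first and then average those ratios per team, a two-step groupby might look like the sketch below. It rebuilds the dataframe from the table in the question; the names rows, per_player and per_team are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
rows = [
    ("a", "s1", "z", False, "English"),
    ("a", "s1", "z", True, "German"),
    ("a", "s1", "z", True, "Geography"),
    ("b", "s1", "z", True, "Geography"),
    ("b", "s1", "z", True, "History"),
    ("b", "s1", "z", False, "English"),
    ("c", "s1", "y", False, "English"),
    ("d", "s1", "y", False, "German"),
    ("d", "s1", "y", True, "English"),
    ("d", "s1", "y", True, "History"),
    ("e", "s1", "w", True, "German"),
    ("e", "s1", "w", True, "History"),
    ("f", "s1", "w", False, "English"),
]
df = pd.DataFrame(rows, columns=["player", "school", "team", "result", "exam"])

# Step 1: share of Trues per player
per_player = df.groupby(["school", "team", "player"])["result"].mean()

# Step 2: average the per-player shares within each team
per_team = (
    per_player.groupby(level=["school", "team"])
    .mean()
    .rename("TeamResult")
    .reset_index()
)
```

On the sample data this gives 0.50 for team w, about 0.33 for team y and about 0.67 for team z, whereas the row-level mean gives 0.67 for w and z and 0.50 for y. The two approaches agree only when every player on a team has the same number of exams.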