I have a dataframe:
id day action_type
1 0 upload
1 0 upload
1 0 upload
1 1 upload
1 1 upload
2 0 upload
2 0 upload
2 1 upload
How to change my query to get table with unique days in column day and average number "upload" action_type among all id's. So desired result must look like this:
day avg_num_action
0 2.5
1 1.5
It is 2.5, because (3 2)/2 (3 uploads of id:1 and 2 uploads for id:2). same for 1.5
How to do that in pandas?
CodePudding user response:
Try crosstab
pd.crosstab(df.id, df.day).mean()
CodePudding user response:
Assuming df
is your dataframe:
df.groupby(['day', 'id']).count().groupby('day').mean()
CodePudding user response:
With a groupby, and unstack:
result = df.groupby(['id', 'day']).size().unstack('day').mean()
result
day
0 2.5
1 1.5
dtype: float64
To match your expected output,add reset_index:
result.reset_index(name = 'avg_num_action')
day avg_num_action
0 0 2.5
1 1 1.5