I have the following table
Player_Id MONTH_LABEL AMT_TC
108 APR_2022 26393.0
108 FEB_2022 13984.0
108 JAN_2022 16139.0
108 JUN_2022 6194.0
108 MAR_2022 19920.0
109 APR_2022 46393.0
109 FEB_2022 23984.0
109 JAN_2022 36139.0
109 JUN_2022 11194.0
109 MAR_2022 19920.0
I need to calculate a column called mean, std which calculates the value based on the Player_ID group keeping all the other columns same.
The expected output
Player_Id MONTH_LABEL AMT_TC Mean STD
108 APR_2022 26393.0 16526 456
108 FEB_2022 13984.0 16526 456
108 JAN_2022 16139.0 16526 456
108 JUN_2022 6194.0 16526 456
108 MAR_2022 19920.0 16526 456
109 APR_2022 46393.0 27526 1211
109 FEB_2022 23984.0 27526 1211
109 JAN_2022 36139.0 27526 1211
109 JUN_2022 11194.0 27526 1211
109 MAR_2022 19920.0 27526 1211
I tried using aggregate function but the old columns go away.
prepm_Month.groupby(['Wh_Player_Id', 'User_Name', 'TIME_SEGMENT']).transform({'BET_AMT_TC': 'mean',
'BET_AMT_TC': 'std'})
CodePudding user response:
You only need to group by the Player ID:
amt = df.groupby("Player_Id")["AMT_TC"]
df["Mean"] = amt.transform("mean")
df["STD"] = amt.transform("std")
CodePudding user response:
You can try join
on groupby.agg
out = df.join(df.groupby('Player_Id').agg(Mean=('AMT_TC', 'mean'), Std=('AMT_TC', 'std')), on='Player_Id')
print(out)
Player_Id MONTH_LABEL AMT_TC Mean Std
0 108 APR_2022 26393.0 16526.0 7453.835959
1 108 FEB_2022 13984.0 16526.0 7453.835959
2 108 JAN_2022 16139.0 16526.0 7453.835959
3 108 JUN_2022 6194.0 16526.0 7453.835959
4 108 MAR_2022 19920.0 16526.0 7453.835959
5 109 APR_2022 46393.0 27526.0 13849.897852
6 109 FEB_2022 23984.0 27526.0 13849.897852
7 109 JAN_2022 36139.0 27526.0 13849.897852
8 109 JUN_2022 11194.0 27526.0 13849.897852
9 109 MAR_2022 19920.0 27526.0 13849.897852