Home > Software engineering >  Python: calculate mean, std within a group and keep the old columns
Python: calculate mean, std within a group and keep the old columns

Time:08-11

I have the following table

   Player_Id    MONTH_LABEL       AMT_TC
    108          APR_2022        26393.0
    108          FEB_2022        13984.0
    108          JAN_2022        16139.0
    108          JUN_2022        6194.0
    108          MAR_2022        19920.0
    109          APR_2022        46393.0
    109          FEB_2022        23984.0
    109          JAN_2022        36139.0
    109          JUN_2022        11194.0
    109          MAR_2022        19920.0

I need to calculate a column called mean, std which calculates the value based on the Player_ID group keeping all the other columns same.

The expected output

   Player_Id    MONTH_LABEL       AMT_TC       Mean       STD
    108          APR_2022        26393.0      16526       456
    108          FEB_2022        13984.0      16526       456
    108          JAN_2022        16139.0      16526       456
    108          JUN_2022        6194.0       16526       456
    108          MAR_2022        19920.0      16526       456
    109          APR_2022        46393.0      27526       1211
    109          FEB_2022        23984.0      27526       1211
    109          JAN_2022        36139.0      27526       1211
    109          JUN_2022        11194.0      27526       1211
    109          MAR_2022        19920.0      27526       1211

I tried using aggregate function but the old columns go away.

prepm_Month.groupby(['Wh_Player_Id', 'User_Name', 'TIME_SEGMENT']).transform({'BET_AMT_TC': 'mean', 
                                 'BET_AMT_TC': 'std'})

CodePudding user response:

You only need to group by the Player ID:

amt = df.groupby("Player_Id")["AMT_TC"]
df["Mean"] = amt.transform("mean")
df["STD"] = amt.transform("std")

CodePudding user response:

You can try join on groupby.agg

out = df.join(df.groupby('Player_Id').agg(Mean=('AMT_TC', 'mean'), Std=('AMT_TC', 'std')), on='Player_Id')
print(out)

   Player_Id MONTH_LABEL   AMT_TC     Mean           Std
0        108    APR_2022  26393.0  16526.0   7453.835959
1        108    FEB_2022  13984.0  16526.0   7453.835959
2        108    JAN_2022  16139.0  16526.0   7453.835959
3        108    JUN_2022   6194.0  16526.0   7453.835959
4        108    MAR_2022  19920.0  16526.0   7453.835959
5        109    APR_2022  46393.0  27526.0  13849.897852
6        109    FEB_2022  23984.0  27526.0  13849.897852
7        109    JAN_2022  36139.0  27526.0  13849.897852
8        109    JUN_2022  11194.0  27526.0  13849.897852
9        109    MAR_2022  19920.0  27526.0  13849.897852
  • Related