Adding a new column whose values are determined by another column (after groupby)

Time:07-10

This is what the original dataframe looks like. I want to add a new column called [withdraw_#] that records how many times each parent_user_id has withdrawn money. [I don't know what steps to take after groupby('parent_user_id').]

This is what the revised dataframe looks like

CodePudding user response:

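# Count the rows in each (user, side) group and broadcast that count to every row.
# Selecting one column makes transform() return a single Series that fits one column.
# Note: a DEPOSIT row receives that user's deposit count, not a withdrawal count.
df['WITHDRAW_#'] = df.groupby(['user', 'side'])['amount'].transform('count')
print(df)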

Input:

    user      side amount
0  10067  WITHDRAW   2000
1  10057   DEPOSIT   5000
2  10067  WITHDRAW   1000
3  10057  WITHDRAW   6000

Output:

    user      side amount  WITHDRAW_#
0  10067  WITHDRAW   2000           2
1  10057   DEPOSIT   5000           1
2  10067  WITHDRAW   1000           2
3  10057  WITHDRAW   6000           1
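
A minimal, self-contained sketch of counting only the WITHDRAW rows per user, so that a user's DEPOSIT rows also show that user's withdrawal count. Rows 0 to 3 are copied from the Input table above; row 4 (user 10099, amount 300) is a hypothetical deposit-only user added purely to show a zero count, and the helper name is_withdraw is likewise an assumption.

```python
import pandas as pd

# Rows 0-3 mirror the Input table; row 4 is a hypothetical deposit-only user
# added here only to illustrate a zero count.
df = pd.DataFrame({
    "user": [10067, 10057, 10067, 10057, 10099],
    "side": ["WITHDRAW", "DEPOSIT", "WITHDRAW", "WITHDRAW", "DEPOSIT"],
    "amount": [2000, 5000, 1000, 6000, 300],
})

# Flag withdrawals as 1 and everything else as 0.
is_withdraw = df["side"].eq("WITHDRAW").astype(int)

# Sum the flags within each user and broadcast the total to all of that user's rows.
df["WITHDRAW_#"] = is_withdraw.groupby(df["user"]).transform("sum")
print(df)
```

Grouping by user alone, rather than by user and side, is what keeps the count tied to withdrawals on every row of that user.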

CodePudding user response:

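def fun(x):
    # Number of rows in this group whose side is "WITHDRAW"
    return (x == "WITHDRAW").sum()

# transform (not agg) returns one value per original row, so the result aligns with df
df["withdraw_#"] = df.groupby("user_id")["side"].transform(fun)

or

df["withdraw_#"] = df.groupby("user_id")["side"].transform(lambda x: (x == "WITHDRAW").sum())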
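
For comparison, a self-contained sketch of the agg route: groupby(...).agg produces one value per user_id, indexed by user_id, so it does not line up with the original row index on its own. One way to attach it to the rows is to map the totals back through the user_id column. The sample rows are taken from the Input table in the first response, with the column renamed to user_id as an assumption; per_user is a hypothetical variable name.

```python
import pandas as pd

# Sample rows from the Input table, with the id column named user_id (assumption).
df = pd.DataFrame({
    "user_id": [10067, 10057, 10067, 10057],
    "side": ["WITHDRAW", "DEPOSIT", "WITHDRAW", "WITHDRAW"],
})

# agg collapses each group to a single number, indexed by user_id.
per_user = df.groupby("user_id")["side"].agg(lambda s: (s == "WITHDRAW").sum())

# Map the per-user totals back onto the rows via the user_id column.
df["withdraw_#"] = df["user_id"].map(per_user)
print(df)
```

This keeps the per-user totals available as a separate Series, which can be handy when they are needed elsewhere as well.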