Question: Output the ID of each user and the total number of actions they performed for tasks they completed (action_name column = "CompleteTask"). If a user from this company (ClassPass) did not complete any tasks in the given period of time, you should still output their ID and the number 0 in the second column.
dataset:
expected result:
CodePudding user response:
Assuming your initial dataframe is named df, you can try this:
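out = (
    df.groupby(['user_id'], as_index=False)
      # Sum num_actions over each user's 'CompleteTask' rows; users with none get 0.
      .apply(lambda x: x[x['action_name'] == 'CompleteTask']['num_actions'].sum())
      # The summed column comes back unnamed (None), so give it a name.
      .rename(columns={None: 'total_actions'})
)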
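
If the groupby/apply version above is slow or raises deprecation warnings on your pandas version, one possible alternative is to zero out the rows that are not 'CompleteTask' first and then do a plain grouped sum. This is only a sketch: the sample dataframe below is made up (the real dataset is not shown in the question), and it assumes the columns are named user_id, action_name and num_actions as in the answer above. It does not filter by company or date, since those columns are not known.

```python
import pandas as pd

# Hypothetical sample data; the real dataset is not shown in the question.
df = pd.DataFrame({
    'user_id': [1, 1, 2, 3, 3],
    'action_name': ['CompleteTask', 'CreateTask', 'CreateTask',
                    'CompleteTask', 'CompleteTask'],
    'num_actions': [4, 2, 5, 1, 3],
})

# Keep num_actions only on 'CompleteTask' rows and replace it with 0 elsewhere,
# so users with no completed tasks still appear in the result with a total of 0.
out = (
    df.assign(total_actions=df['num_actions'].where(df['action_name'] == 'CompleteTask', 0))
      .groupby('user_id', as_index=False)['total_actions']
      .sum()
)
print(out)
```

Here user 2 has no 'CompleteTask' rows but is still kept in the output, because no rows are dropped before grouping.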