Home > Back-end >  How to group by one column if condition is true in another column summing values in third column wit
How to group by one column if condition is true in another column summing values in third column wit

Time:05-04

I can't think of how to do this: As the headline explains I want to group a dataframe by the column acquired_month only if another column contains Closed Won(in the example I made a helper column that just marks True if that condition is fulfilled although I'm not sure that step is necessary). Then if those conditions are met I want to sum the values of a third column but can't think how to do it. Here is my code so far:

us_lead_scoring.loc[us_lead_scoring['Stage'].str.contains('Closed Won'), 'closed_won_binary'] = True acquired_date = us_lead_scoring.groupby('acquired_month')['closed_won_binary'].sum()

but this just sums the true false column not the sum column if the true false column is true after the acquired_month groupby. Any direction appreciated.

Thanks

CodePudding user response:

If need aggregate column col replace non matched values to 0 values in Series.where and then aggregate sum:

us_lead_scoring = pd.DataFrame({'Stage':['Closed Won1','Closed Won2','Closed', 'Won'],
                                'col':[1,3,5,6],
                                'acquired_month':[1,1,1,2]})

out = (us_lead_scoring['col'].where(us_lead_scoring['Stage']
                             .str.contains('Closed Won'), 0)
                             .groupby(us_lead_scoring['acquired_month'])
                             .sum()
                             .reset_index(name='SUM'))
     
print (out)
   acquired_month  SUM
0               1    4
1               2    0
  • Related