How to group by one column if condition is true in another column summing values in third column wit-CodePudding

I can't think of how to do this: As the headline explains I want to group a dataframe by the column acquired_month only if another column contains Closed Won(in the example I made a helper column that just marks True if that condition is fulfilled although I'm not sure that step is necessary). Then if those conditions are met I want to sum the values of a third column but can't think how to do it. Here is my code so far:

us_lead_scoring.loc[us_lead_scoring['Stage'].str.contains('Closed Won'), 'closed_won_binary'] = True acquired_date = us_lead_scoring.groupby('acquired_month')['closed_won_binary'].sum()

but this just sums the true false column not the sum column if the true false column is true after the acquired_month groupby. Any direction appreciated.

Thanks

CodePudding user response：

If need aggregate column col replace non matched values to 0 values in Series.where and then aggregate sum:

us_lead_scoring = pd.DataFrame({'Stage':['Closed Won1','Closed Won2','Closed', 'Won'],
                                'col':[1,3,5,6],
                                'acquired_month':[1,1,1,2]})

out = (us_lead_scoring['col'].where(us_lead_scoring['Stage']
                             .str.contains('Closed Won'), 0)
                             .groupby(us_lead_scoring['acquired_month'])
                             .sum()
                             .reset_index(name='SUM'))
     
print (out)
   acquired_month  SUM
0               1    4
1               2    0