My df looks like this:
value type
   12    x
   34    z
   54    x
   14    y
I want to create a new column, sum, that holds the sum of the value column, but only in the rows where type == x. The remaining rows should be empty. For example, the output should look like this:
value type sum
   12    x  66
   34    z
   54    x  66
   14    y
CodePudding user response:
If you want to handle a single type (only x):
mask = df['type'].eq('x')
df.loc[mask, 'sum'] = df.loc[mask, 'value'].sum()
If you might need to handle several types:
types = ['x']  # add others, e.g.: types = ['x', 'y']
df.loc[df['type'].isin(types), 'sum'] = (
    df.groupby('type')['value'].transform('sum')
)
Output:
   value type   sum
0     12    x  66.0
1     34    z   NaN
2     54    x  66.0
3     14    y   NaN
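To see what the multi-type variant does once more than one type is listed, here is a minimal sketch. The frame is rebuilt from the question's sample data, and types = ['x', 'y'] is only an assumption picked for illustration; group_sums is a helper name that does not appear in the answer above.

```python
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({'value': [12, 34, 54, 14],
                   'type': ['x', 'z', 'x', 'y']})

types = ['x', 'y']  # assumption: two types selected for illustration

# Per-type totals, broadcast back onto every row of the frame
group_sums = df.groupby('type')['value'].transform('sum')

# Keep the totals only on rows whose type is selected; the rest stay NaN
df.loc[df['type'].isin(types), 'sum'] = group_sums

print(df)
```

Each selected type gets its own total here (x and y are summed separately), while the z row is left as NaN.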
CodePudding user response:
We can use boolean indexing with loc:
m = df['type'].eq('x')
df.loc[m, 'sum'] = df.loc[m, 'value'].sum()
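For anyone who wants to run this end to end, a small self-contained sketch follows. The DataFrame is reconstructed from the question's sample, and total is just an illustrative intermediate name.

```python
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({'value': [12, 34, 54, 14],
                   'type': ['x', 'z', 'x', 'y']})

m = df['type'].eq('x')            # True only on the 'x' rows
total = df.loc[m, 'value'].sum()  # 12 + 54

# Assigning through loc creates the column; unmatched rows are left as NaN
df.loc[m, 'sum'] = total

print(df)
```

Because the unmatched rows hold NaN, the new column comes out as float, which is why the totals display with a decimal point.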