In Python, I want NaNs to sum to NaN instead of zero when using np.sum in groupby([ ])[ ].transform

Time:06-07

I need to use the transform function to sum column 'bar' within each group of column 'foo'. I use the following code:

df.groupby(['foo'])['bar'].transform(np.sum)

However, when all the values of 'bar' in a group are NaN, my desired output is NaN, but the above code returns zero instead. How can I fix this? I know that the sum function accepts min_count=1, but I am not sure how to use it in the above context.

CodePudding user response:

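The sum method has a min_count argument that sets the required number of non-NaN values. If a group has fewer than min_count non-NaN values, the result is NaN. The default is min_count=0, which is why an all-NaN group sums to zero. Pass the method name as the string 'sum' to transform so that the keyword argument is forwarded to it: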

# require at least one non-NaN value per group; otherwise the result is NaN
df.groupby(['foo'])['bar'].transform('sum', min_count=1)
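
To see the difference, here is a minimal sketch on a made-up DataFrame (the sample values and the variable names default_sum and nan_aware_sum are assumptions for illustration, not from the question). Group 'a' has one real value and group 'b' is entirely NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: group 'a' has one real value, group 'b' is all NaN
df = pd.DataFrame({
    "foo": ["a", "a", "b", "b"],
    "bar": [1.0, np.nan, np.nan, np.nan],
})

# Default behaviour (min_count=0): the all-NaN group sums to 0.0
default_sum = df.groupby(["foo"])["bar"].transform("sum")

# With min_count=1: the all-NaN group stays NaN
nan_aware_sum = df.groupby(["foo"])["bar"].transform("sum", min_count=1)

print(pd.DataFrame({"default": default_sum, "min_count_1": nan_aware_sum}))
```

Group 'a' gets 1.0 in both columns, since NaNs are still skipped whenever at least one real value is present. Only the all-NaN group 'b' changes, from 0.0 to NaN.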