Home > Software design >  How to retain blanks in group by in pandas?
How to retain blanks in group by in pandas?

Time:03-10

I need to groupby my DataFrame in pandas but when I am doing that the null values are converting into zero, but I want to retain nulls. I am not sure how to do that in pandas.

Input:

Id  Country  Product  sales  qty  price
1   Germany  shoes    32      1   NaN
1   Germany  shoes    32      1    2
2   England  Shoes    22      1   NaN
2   England  Shoes    22      1   NaN
3   Austria  Shoes    0       3   NaN
3   Austria  Shoes    NaN     NaN NaN

Desired output:

Id  Country  Product  sales  qty  price
1   Germany  shoes    64      2   2
2   England  Shoes    44      2   NaN
3   Austria  Shoes    0       3   NaN

CodePudding user response:

Use parameter min_count=1 in sum:

df = df.groupby(['Id','Country','Product'], as_index=False).sum(min_count=1)
print (df)
   Id  Country Product  sales  qty  price
0   1  Germany   shoes   64.0  2.0    2.0
1   2  England   Shoes   44.0  2.0    NaN
2   3  Austria   Shoes    0.0  3.0    NaN

CodePudding user response:

You could mask it using isna group all

out = (df.groupby(['Id','Country','Product']).sum()
       .mask(df[['sales','qty','price']].isna()
             .groupby([df['Id'], df['Country'], df['Product']]).all())
       .reset_index())

The same idea written differently:

cols = ['Id','Country','Product']
g = df.groupby(cols)
out = (g.sum()
       .mask(g.apply(lambda x: x.drop(columns=cols).isna().all()))
       .reset_index())

Output:

   Id  Country Product  sales  qty  price
0   1  Germany   shoes   64.0  2.0    2.0
1   2  England   Shoes   44.0  2.0    NaN
2   3  Austria   Shoes    0.0  3.0    NaN
  • Related