Home > OS >  How to count values in group by with condition?
How to count values in group by with condition?

Time:10-26

I have dataframe, which was originnaly built up by 2 dataframes - launches.merge(dropped,on=('id',how='left'), so in result I got

id name colour dropped
1   K    pink    1
2   K    red     1
3   L    blue    1
4   M    red    NaN
5   K    pink    1
6   K    pink   NaN

where column dropped came from the dataframe with dropes(it's a flag) In result I want to get dataframe like this

name colour launches  dropped
K    pink    3          2
K    red     1          1
L    blue    1          1
M    red     1          0

I tried the following

df = d.groupby(['name','colour']).count()

CodePudding user response:

You can use a named aggregation to do the count and sum separately :

(df.groupby(['name', 'colour'])
   .agg(launches = ('dropped', 'size'), 
        dropped = ('dropped', 'sum'))
   .reset_index()
)

             
name colour    launches  dropped
K    pink           3      2.0
     red            1      1.0
L    blue           1      1.0
M    red            1      0.0

Another option, thanks to @HenryEcker, is to use a count, instead of a sum, since count ignores nulls, and returns integers :

(df.groupby(['name', 'colour'])
   .agg(launches=('dropped', 'size'), 
        dropped=('dropped', 'count'))
   .reset_index()
)
Out[135]:
             
name colour  launches  dropped
K    pink           3        2
     red            1        1
L    blue           1        1
M    red            1        0
  • Related