Home > Net >  pandas dataframe groupby apply multi columns and get count
pandas dataframe groupby apply multi columns and get count

Time:04-28

I have a excel like this:

year a b
2021 12 23
2021 31 0
2021 15 21
2021 14 0
2022 32 0
2022 24 15
2022 28 29
2022 33 0

I wanna get count of condition: a>=30 and b==0 group by year the final output like this:

2021 1

2022 2

I wanna use pandas dataframe to implement this, can anyone help? I'm quite new to python

CodePudding user response:

For count matched rows chain both conditions by & for bitwise AND and aggregate sum, Trues are processing like 1 and False like 0:

df1 = ((df.a>=30) & (df.b==0)).astype(int)
           .groupby(df['year']).sum().reset_index(name='count')
print (df1)
   year  count
0  2021      1
1  2022      2

Similar idea with helper column:

df1 = (df.assign(count = ((df.a>=30) & (df.b==0)).astype(int))
         .groupby('year', as_index=False)['count']
         .sum())
  • Related