How to filter dataset by number of rows of specific group?

Time:04-28

I have a dataset:

id     value
a1      14
a1      2
a1      34
a1      11
a1      78
b1      11
b1      9
b1      6

I want to filter that dataset by the number of rows in each group, so that no group has more than 4 rows. The desired output is:

id     value
a1      14
a1      2
a1      34
a1      11
b1      11
b1      9
b1      6

How can I do that?

CodePudding user response:

You can use groupby.head, which keeps the first n rows of each group:

out = df.groupby('id').head(4)

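If you have pandas >= 1.4.0, you can use groupby.nth with slicing as well. Note that before pandas 2.0, nth indexes its result by the group key, so the index may differ from the output shown below: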

out = df.groupby('id').nth[:4]

Output

   id  value
0  a1     14
1  a1      2
2  a1     34
3  a1     11
5  b1     11
6  b1      9
7  b1      6
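
For completeness, here is a minimal runnable sketch that rebuilds the sample data from the question and applies groupby.head. The variable name max_rows is hypothetical, and the cumcount variant is an alternative added here as an assumption, not part of the answer above; it should select the same rows by numbering rows within each group.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({
    "id": ["a1"] * 5 + ["b1"] * 3,
    "value": [14, 2, 34, 11, 78, 11, 9, 6],
})

max_rows = 4  # hypothetical name for the per-group row cap

# Approach from the answer: keep the first max_rows rows of every group.
out = df.groupby("id").head(max_rows)

# Alternative (an assumption, not from the answer): number the rows within
# each group starting at 0 and keep those whose position is below the cap.
out_mask = df[df.groupby("id").cumcount() < max_rows]

print(out)
```

Both variants keep the original row order and index. The cumcount form can be handy when the condition has to be combined with other boolean filters.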