pandas: how to filter rows by a condition after groupby?

Time:12-13

After a groupby, I'd like to keep only the rows whose difference from the previous row in the same group is less than 20.

Here is my original table

A     B     Value
1     1       0
1     2       10
1     2       20
1     2       25
2     1       0
2     1       15
2     1       100

After 'df.groupby(['A', 'B'])', it would be

A     B     Value
1     1       0           
1     2       10
      2       20
      2       25
2     1       0
      1       15
      1       100

I would like to keep only those rows whose difference from the previous row in the group is less than 20.

For example,

Group[A=1][B=1] row 1 has no previous row, so drop it.

Group[A=1][B=2] row 1 has no previous row, so drop it.

Group[A=1][B=2] row 2 has a difference of 10, so keep it.

Group[A=2][B=1] row 3 has a difference of 85 (100 - 15), so drop it.

The results I expect:

A     B     Value
1     2       20
1     2       25
2     1       15

Here is what I tried:

df.groupby(['A', 'B']).filter(lambda x : (x['Value'] - x['Value'].shift(1) < 20).any())

But it does not work.

Any advice would be greatly appreciated.

CodePudding user response:

Use a per-group diff and compare it with 20:

out = df[df.groupby(['A', 'B'])['Value'].diff().lt(20)]
out
Out[398]: 
   A  B  Value
2  1  2     20
3  1  2     25
5  2  1     15
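
In case it helps to see the intermediate step, here is a minimal self-contained sketch of the same idea, rebuilding the table from the question. The names `step` and `mask` are only illustrative and are not part of the answer above.

```python
import pandas as pd

# Table from the question
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 2, 2, 2],
    "B": [1, 2, 2, 2, 1, 1, 1],
    "Value": [0, 10, 20, 25, 0, 15, 100],
})

# Difference from the previous row within each (A, B) group.
# The first row of every group has no predecessor, so its diff is NaN.
step = df.groupby(["A", "B"])["Value"].diff()

# NaN compared with 20 evaluates to False, so first rows drop out on their own.
mask = step.lt(20)

out = df[mask]
print(out)
```

Note that `lt(20)` also keeps rows where the value decreased (a negative difference). If the intent is an absolute difference under 20, `step.abs().lt(20)` would be the variant to try.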

CodePudding user response:

cond1 = df.groupby(['A', 'B'])['Value'].transform(lambda x: x.diff(1).lt(20))
df[cond1]

    A   B   Value
2   1   2   20
3   1   2   25
5   2   1   15
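
As for why the `filter` attempt in the question returns too many rows: as far as I understand it, `GroupBy.filter` expects one True/False per group and then keeps or drops the whole group, so it cannot select individual rows. A small sketch of the contrast, using the data from the question (`kept_groups` and `kept_rows` are just illustrative names):

```python
import pandas as pd

# Table from the question
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 2, 2, 2],
    "B": [1, 2, 2, 2, 1, 1, 1],
    "Value": [0, 10, 20, 25, 0, 15, 100],
})

# The lambda returns a single bool per group, so every row of a group
# survives as soon as any one of its differences is below 20.
kept_groups = df.groupby(["A", "B"]).filter(
    lambda g: (g["Value"].diff() < 20).any()
)

# Row-level selection needs a boolean Series aligned with df's index instead.
kept_rows = df[df.groupby(["A", "B"])["Value"].diff().lt(20)]

print(kept_groups["Value"].tolist())
print(kept_rows["Value"].tolist())
```

So `filter` is the right tool for "drop groups that fail a test", while a mask built from `diff` (or `transform`) is what selects rows inside the groups.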