Home > Software engineering >  Groupby streak of numbers and a mask to groupby two rows after each group
Groupby streak of numbers and a mask to groupby two rows after each group

Time:01-17

This is my dataframe:

df = pd.DataFrame({'a': [20, 1, 55, 333, 444, 1, 2, 10], 'b': [20, 20, 21, 21, 21, 22, 22, 22]})

I want to group them by column b and two rows after each group. This is the output that I need:

    a   b
0  20   20
1   1   20
2   55  21
3  333  21

     a   b
2   55  21
3  333  21
4  444  21
5   1   22
6   2   22

    a   b
5   1   22
6   2   22
7  10   22

I know that I need a mask. I have tried some of them but didn't work. This is one of my tries:

df.groupby(df.b.diff().cumsum().eq(1))

CodePudding user response:

You cannot achieve this just with a grouper as groups cannot overlap.

You can however use a groupby with side effects by referencing the original DataFrame:

(df.groupby('b')
   .apply(lambda g: df.loc[g.index[0]:g.index[-1] 2])
)

Output:

        a   b
b            
20 0   20  20
   1    1  20
   2   55  21
   3  333  21
21 2   55  21
   3  333  21
   4  444  21
   5    1  22
   6    2  22
22 5    1  22
   6    2  22
   7   10  22

As a loop:

for k, g in df.groupby('b'):
    print(k)
    print(df.loc[g.index[0]:g.index[-1] 2])

Output:

20
     a   b
0   20  20
1    1  20
2   55  21
3  333  21
21
     a   b
2   55  21
3  333  21
4  444  21
5    1  22
6    2  22
22
    a   b
5   1  22
6   2  22
7  10  22
  • Related