Groupby streak of numbers and a mask to groupby two rows after each group-CodePudding

This is my dataframe:

df = pd.DataFrame({'a': [20, 1, 55, 333, 444, 1, 2, 10], 'b': [20, 20, 21, 21, 21, 22, 22, 22]})

I want to group them by column b and two rows after each group. This is the output that I need:

I know that I need a mask. I have tried some of them but didn't work. This is one of my tries:

df.groupby(df.b.diff().cumsum().eq(1))

CodePudding user response：

You cannot achieve this just with a grouper as groups cannot overlap.

You can however use a groupby with side effects by referencing the original DataFrame:

(df.groupby('b')
   .apply(lambda g: df.loc[g.index[0]:g.index[-1] 2])
)

Output:

        a   b
b            
20 0   20  20
   1    1  20
   2   55  21
   3  333  21
21 2   55  21
   3  333  21
   4  444  21
   5    1  22
   6    2  22
22 5    1  22
   6    2  22
   7   10  22

As a loop:

for k, g in df.groupby('b'):
    print(k)
    print(df.loc[g.index[0]:g.index[-1] 2])

Output: