pandas get (start, end) positions of True values based on another column

Time:09-20

I have the below df:

import pandas as pd
frames = [9,12,14,15,16,17,18,22,23,24,25,30]
counter = [0,0,0,0,1,1,1,0,0,1,1,0]
df = pd.DataFrame({'frames':frames, 'counter':counter})
print(df)

    frames  counter
0        9        0
1       12        0
2       14        0
3       15        0
4       16        1
5       17        1
6       18        1
7       22        0
8       23        0
9       24        1
10      25        1
11      30        0

Without using a for loop, how can I get the (start, end) frames where the counter is 1 (True)?

result:

[(16,18), (24,25)]

CodePudding user response:

You can build a mask with ne(1), take its cumsum to give each run of consecutive 1s its own label, and use loc to keep only the rows where counter is 1. Then groupby.agg gets the first and last value per group.

Either as a DataFrame:

m = df['counter'].ne(1)

out = (df.loc[~m, 'frames']
      .groupby(m.cumsum())
      .agg(['first', 'last'])
      )

output:

         first  last
counter             
4           16    18
6           24    25

Or as a list of tuples, by aggregating each group into a (first, last) tuple and then converting with to_list:

lst = (df.loc[~m, 'frames']
      .groupby(m.cumsum())
      .agg(lambda x: (x.iloc[0], x.iloc[-1]))
      .to_list()
      )

output:

[(16, 18), (24, 25)]
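
To see why the group keys in the first output are 4 and 6, it can help to print the intermediate values. This is only a sketch on the df from the question; the names `group`, `debug`, `is_not_one` and `labels` are made up for illustration and are not part of the answer above.

```python
import pandas as pd

frames = [9, 12, 14, 15, 16, 17, 18, 22, 23, 24, 25, 30]
counter = [0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
df = pd.DataFrame({'frames': frames, 'counter': counter})

# True wherever counter is NOT 1. Each such row bumps the running sum,
# so the sum stays constant across a run of consecutive 1s.
m = df['counter'].ne(1)
group = m.cumsum()

# Illustrative helper frame (hypothetical column names) to inspect the steps.
debug = df.assign(is_not_one=m, group=group)
print(debug)

# Labels of the rows that survive the ~m filter: one label per run of 1s.
labels = group[~m].tolist()
print(labels)  # [4, 4, 4, 6, 6]
```

The labels are not consecutive (4, then 6) because every non-1 row between the runs also increments the sum. That does not matter for the groupby, which only needs each run to have a distinct label.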

CodePudding user response:

First create a mask by comparing counter to 1. Then flag the start of each run of consecutive Trues with Series.shift, label the runs with Series.cumsum, aggregate each run with GroupBy.first and GroupBy.last, and finally convert the rows to tuples:

m = df.counter.eq(1)

L = (df[m].groupby((m & ~m.shift(fill_value=False)).cumsum())['frames']
          .agg(['first','last'])
          .agg(tuple, axis=1)
          .tolist())
print (L)
[(16, 18), (24, 25)]
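
The grouping key in this answer can be broken into steps as well. A minimal sketch, again on the df from the question; `starts`, `run_id`, `start_rows` and `run_ids` are illustrative names only.

```python
import pandas as pd

frames = [9, 12, 14, 15, 16, 17, 18, 22, 23, 24, 25, 30]
counter = [0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
df = pd.DataFrame({'frames': frames, 'counter': counter})

m = df['counter'].eq(1)

# A row starts a run when it is True and the previous row is not.
# fill_value=False treats the (missing) row before the first one as False.
starts = m & ~m.shift(fill_value=False)

# Counting the starts seen so far numbers the runs 1, 2, ...
run_id = starts.cumsum()

start_rows = df.index[starts].tolist()
run_ids = run_id[m].tolist()
print(start_rows)  # [4, 9]
print(run_ids)     # [1, 1, 1, 2, 2]
```

Here the labels come out consecutive (1, 2), since only the first row of each run increments the sum.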