I have the following DataFrame:
import pandas as pd
frames = [9,12,14,15,16,17,18,22,23,24,25,30]
counter = [0,0,0,0,1,1,1,0,0,1,1,0]
df = pd.DataFrame({'frames':frames, 'counter':counter})
print(df)
    frames  counter
0        9        0
1       12        0
2       14        0
3       15        0
4       16        1
5       17        1
6       18        1
7       22        0
8       23        0
9       24        1
10      25        1
11      30        0
Without using a for loop, how can I get the (start, end) frames of each consecutive run where the counter is 1 (True)?
Expected result:
[(16,18), (24,25)]
CodePudding user response:
You can use ne(1) and cumsum, combined with loc, to group the consecutive 1s together. Then use groupby.agg to get the first and last value of each group.
Either as a DataFrame:
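# True on every row where counter is not 1
m = df['counter'].ne(1)
# keep only the rows with 1s; m.cumsum() is constant within each run of 1s
out = (df.loc[~m, 'frames']
         .groupby(m.cumsum())
         .agg(['first', 'last'])
      )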
output:
         first  last
counter
4           16    18
6           24    25
Or as a list of tuples, by aggregating each group into a (first, last) tuple and then converting with to_list:
lst = (df.loc[~m, 'frames']
         .groupby(m.cumsum())
         .agg(lambda x: (x.iloc[0], x.iloc[-1]))
         .to_list()
      )
output:
[(16, 18), (24, 25)]
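To see why the cumulative sum works as a grouping key, it can help to inspect the intermediate values. The following is a minimal sketch that reuses the question's data; the names group, steps and group_keys are only illustrative and not part of the answer above.

```python
import pandas as pd

frames = [9, 12, 14, 15, 16, 17, 18, 22, 23, 24, 25, 30]
counter = [0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
df = pd.DataFrame({'frames': frames, 'counter': counter})

# True wherever counter is NOT 1
m = df['counter'].ne(1)

# The running count of non-1 rows only increases on non-1 rows,
# so it stays constant across each consecutive run of 1s
group = m.cumsum()

# Illustrative helper column, added only to inspect the grouping key
steps = df.assign(group=group)
print(steps[~m])

# Grouping keys of the rows where counter is 1
group_keys = group[~m].tolist()
print(group_keys)
```

Each run of 1s shares a single key, and the keys differ between runs because at least one non-1 row separates them. The key values themselves (here 4 and 6) carry no meaning beyond telling the runs apart.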
CodePudding user response:
First create a mask by comparing counter to 1, then label each run of consecutive True values with Series.shift and Series.cumsum, aggregate with GroupBy.first and GroupBy.last, and finally convert the rows to tuples:
m = df.counter.eq(1)
L = (df[m].groupby((m & ~m.shift(fill_value=False)).cumsum())['frames']
          .agg(['first', 'last'])
          .agg(tuple, axis=1)
          .tolist())
print(L)
[(16, 18), (24, 25)]
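The key step in this answer is the run-start mask: m & ~m.shift(fill_value=False) should be True only on the first row of each run of 1s, so its cumulative sum numbers the runs 1, 2, and so on. A minimal sketch of those intermediate values, assuming the question's data; the names starts, run_id, start_frames and run_ids are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'frames': [9, 12, 14, 15, 16, 17, 18, 22, 23, 24, 25, 30],
    'counter': [0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0],
})

m = df['counter'].eq(1)

# True only where a 1 follows a non-1 row (or sits on the very first row)
starts = m & ~m.shift(fill_value=False)

# Running count of run starts; only meaningful on rows where m is True
run_id = starts.cumsum()

# Frame at which each run begins, and the run label of every 1 row
start_frames = df.loc[starts, 'frames'].tolist()
run_ids = run_id[m].tolist()
print(start_frames)
print(run_ids)
```

Unlike the ne(1) variant, the labels here count the runs directly, which can be convenient if the run number is needed later.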