Find segments in Pandas dataframe

Time:11-07

I want to find the start and end timestamps of each segment marked by a run of True values. The data looks like this:

t
2021-06-19 21:29:38     True
2021-06-19 21:29:48     True
2021-06-19 21:29:58     True
2021-06-19 21:30:08    False
2021-06-19 21:30:18    False
2021-06-19 21:30:28    False
2021-06-19 21:30:38    False
2021-06-19 21:30:48    False
2021-06-19 21:30:58     True
2021-06-19 21:31:08     True
2021-06-19 21:31:18     True
2021-06-19 21:31:28     True
2021-06-19 21:31:38     True
2021-06-19 21:31:48     True
2021-06-19 21:31:58     True
2021-06-19 21:32:08     True
2021-06-19 21:32:18     True
2021-06-19 21:32:28     True
2021-06-19 21:32:38     True
2021-06-19 21:32:48     True
Name: AT, dtype: bool

Now I need to extract two segments: the first from 21:29:38 to 21:29:58 and the second from 21:30:58 to 21:32:48. Is there a way to do this? I tried filtering for the True values, but then the False rows between the segments are gone and I can no longer tell where one segment ends and the next begins.
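For reproducibility, a series shaped like the one above can be built roughly as follows. This is only a sketch: the variable name s is an assumption (chosen to match the answer's code), and the regular 10-second spacing is read off the listing rather than known from the real data.

```python
import pandas as pd

# Twenty samples at an assumed regular 10-second spacing, index named "t".
idx = pd.date_range(
    "2021-06-19 21:29:38",
    periods=20,
    freq=pd.Timedelta(seconds=10),
    name="t",
)

# 3 True rows, 5 False rows, 12 True rows, as in the listing.
values = [True] * 3 + [False] * 5 + [True] * 12

# "s" is a hypothetical name for the boolean Series.
s = pd.Series(values, index=idx, name="AT")
print(s)
```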

CodePudding user response:

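Use cumsum() on the negated condition. (~s).cumsum() increases by one at every False row and stays constant across each run of True rows, so every True segment gets its own group key: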

# s is the boolean Series from the question; s[s] keeps only the True rows
for group, data in s[s].groupby((~s).cumsum()):
    print(data)

Output:

t
2021-06-19 21:29:38    True
2021-06-19 21:29:48    True
2021-06-19 21:29:58    True
Name: AT, dtype: bool
t
2021-06-19 21:30:58    True
2021-06-19 21:31:08    True
2021-06-19 21:31:18    True
2021-06-19 21:31:28    True
2021-06-19 21:31:38    True
2021-06-19 21:31:48    True
2021-06-19 21:31:58    True
2021-06-19 21:32:08    True
2021-06-19 21:32:18    True
2021-06-19 21:32:28    True
2021-06-19 21:32:38    True
2021-06-19 21:32:48    True
Name: AT, dtype: bool
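Since the question asks for the start and end timestamp of each segment rather than the rows themselves, the same grouping key can be aggregated instead of printed. The following is a minimal sketch, not part of the original answer: the sample data is rebuilt under the assumption of regular 10-second spacing, and the names key, times and segments are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed regular 10-second spacing).
idx = pd.date_range(
    "2021-06-19 21:29:38",
    periods=20,
    freq=pd.Timedelta(seconds=10),
    name="t",
)
s = pd.Series([True] * 3 + [False] * 5 + [True] * 12, index=idx, name="AT")

# Same grouping key as in the loop above: constant within each run of True rows.
key = (~s).cumsum()

# Timestamps as a Series so they can be aggregated like ordinary values.
times = s.index.to_series()

# Keep only the True rows, then take the first and last timestamp per group.
segments = (
    times[s]
    .groupby(key[s])
    .agg(start="min", end="max")
    .reset_index(drop=True)
)
print(segments)
```

With the sample data this gives two rows, 21:29:38 to 21:29:58 and 21:30:58 to 21:32:48, which are the two segments requested in the question.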