Pandas groupby all rows between boolean column values

Time:11-22

I'm looking for the pandas syntax to group the rows that fall between True values in a boolean column.

DataTable

distance  time                new
0         2020-10-11 1:30:15  true
5         2020-10-11 1:31:15  false
10        2020-10-11 1:32:15  false
1         2020-10-11 2:30:15  true
2         2020-10-11 2:31:15  false
30        2020-10-11 2:32:15  false
31        2020-10-11 2:33:15  false
0         2020-10-12 1:31:15  true
5         2020-10-12 1:32:15  false

Question

I want to group all the rows that belong to each activity. A new activity starts at each True value in the "new" column, so the data above should yield 3 groups. Each group should start with a True row and end with the last False row before the next True. How can this be achieved using groupby?

Code

I can already use groupby to group by date, as shown here, but I want to change this so that the rows are grouped into activities based on the "new" column. I still plan to iterate over the groups in a for loop like this:

for idx, day in df.groupby(df.timestamp.dt.date):
    print(idx)
    print(day)

Answer

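Group by df['new'].cumsum(). Each True counts as 1 and each False as 0, so the running total goes up by one at every True and stays constant across the False rows that follow. Every activity therefore gets its own group key: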

for idx, day in df.groupby(df['new'].cumsum()):
    print('-----')
    print(idx)
    print(day)

Output:

-----
1
   distance                time    new
0         0  2020-10-11 1:30:15   True
1         5  2020-10-11 1:31:15  False
2        10  2020-10-11 1:32:15  False
-----
2
   distance                time    new
3         1  2020-10-11 2:30:15   True
4         2  2020-10-11 2:31:15  False
5        30  2020-10-11 2:32:15  False
6        31  2020-10-11 2:33:15  False
-----
3
   distance                time    new
7         0  2020-10-12 1:31:15   True
8         5  2020-10-12 1:32:15  False
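
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and assumes that "new" is a real boolean column and that "time" has been parsed to datetimes. The names activity_id and summary are only illustrative and do not appear in the original post.

```python
import pandas as pd

# Sample data from the question; "new" is assumed to hold real booleans.
df = pd.DataFrame({
    "distance": [0, 5, 10, 1, 2, 30, 31, 0, 5],
    "time": pd.to_datetime([
        "2020-10-11 01:30:15", "2020-10-11 01:31:15", "2020-10-11 01:32:15",
        "2020-10-11 02:30:15", "2020-10-11 02:31:15", "2020-10-11 02:32:15",
        "2020-10-11 02:33:15", "2020-10-12 01:31:15", "2020-10-12 01:32:15",
    ]),
    "new": [True, False, False, True, False, False, False, True, False],
})

# Running count of True values: it increases at each True row and is
# carried forward over the False rows that follow.
activity_id = df["new"].cumsum()

# Iterate over the activities, as in the question's for loop.
for idx, activity in df.groupby(activity_id):
    print("-----")
    print(idx)
    print(activity)

# Optional: one summary row per activity instead of a loop.
summary = df.groupby(activity_id).agg(
    start=("time", "min"),
    end=("time", "max"),
    rows=("distance", "size"),
    max_distance=("distance", "max"),
)
print(summary)
```

Two caveats. If the column holds the strings "true" and "false" rather than booleans, it would need converting first, for example with df["new"] == "true". If the data does not start with a True row, the leading rows end up in a group with key 0.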