Home > Mobile >  Append unique group ids from pandas rolling window
Append unique group ids from pandas rolling window

Time:09-14

I have a dataframe with timestamps, which I have used DataFrame.rolling() to find occurrences within a small time window (10 seconds), following the accepted answer to this question.

However, I would like to add a new column with unique group ids for all the rows which occur within the same time window.

After running the accepted answer, my dataframe looks like this:

timestamp                user_id    count
2021-01-08 10:00:01      1          1
2021-01-08 10:00:02      2          2
2021-01-08 10:00:03      3          3
2021-01-08 10:00:09      1          4
2021-01-08 11:00:01      1          1
2021-01-08 11:00:02      7          2
2021-01-08 11:00:11      3          1

My desired output would be something like this:

timestamp                user_id    count    window_group
2021-01-08 10:00:01      1          1        1
2021-01-08 10:00:02      2          2        1
2021-01-08 10:00:03      3          3        1
2021-01-08 10:00:09      1          4        1
2021-01-08 11:00:01      1          1        2
2021-01-08 11:00:02      7          2        2
2021-01-08 11:00:11      3          1        3

CodePudding user response:

You can identify the reset in number with diff and boolean not (~), then cumsum:

df['window_group'] = (~df['count'].diff().gt(0)).cumsum()

output:

             timestamp  user_id  count  window_group
0  2021-01-08 10:00:01        1      1             1
1  2021-01-08 10:00:02        2      2             1
2  2021-01-08 10:00:03        3      3             1
3  2021-01-08 10:00:09        1      4             1
4  2021-01-08 11:00:01        1      1             2
5  2021-01-08 11:00:02        7      2             2
6  2021-01-08 11:00:11        3      1             3
  • Related