Home > database >  r - Filtering rows by consecutive overlapping time intervals
r - Filtering rows by consecutive overlapping time intervals

Time:12-04

I got the following dataframe:

   Sensor Event          Start.Time            End.Time
1       1 Alert   05/01/2022 03:50:00 05/01/2022 04:10:00
2       1 Alarm   05/01/2022 03:55:00 05/01/2022 04:03:00
3       2 Alert   05/01/2022 03:56:00 05/01/2022 04:00:00
4       2 Alarm   06/02/2022 12:31:00 06/02/2022 12:54:00
5       1 Alert   14/01/2022 22:44:00 15/01/2022 00:40:00
6       1 Alarm   14/01/2022 22:36:00 15/01/2022 00:15:00
7       2 Alert   06/02/2022 12:25:00 06/02/2022 13:01:00
8       4 Alarm   05/03/2022 15:00:00 05/03/2022 15:30:00
9       3 Alert   07/04/2022 16:25:00 07/04/2022 16:38:00
10      3 Alarm   07/04/2022 16:27:00 07/04/2022 16:40:00
11      4 Alert   08/05/2022 19:00:00 08/05/2022 19:33:00
12      4 Alarm   08/05/2022 19:12:00 08/05/2022 19:21:00
13      1 Alarm   14/01/2022 22:46:00 15/01/2022 00:12:00
14      2 Alert   11/06/2022 19:00:00 11/06/2022 20:34:00
15      2 Alarm   11/06/2022 19:21:00 11/06/2022 20:12:00

from this code:

st <- c(20220105035000,20220105035500,20220105035600,20220206123100,20220114224400,20220114223600,20220206122500,
        20220305150000,20220407162500,20220407162700,20220508190000,20220508191200,20220114224600,20220611190000,
        20220611192100)

et <-c(20220105041000,20220105040300,20220105040000,20220206125400,20220115004000,20220115001500,
20220206130100,20220305153000,2020407163800,20220407164000,20220508193300,
20220508192100,20220115001200,20220611203400,20220611201200)


df <- data.frame(Sensor=c(1,1,2,2,1,1,2,4,3,3,4,4,1,2,2),
                 Event=c("Alert","Alarm","Alert","Alarm","Alert","Alarm","Alert","Alarm",
                         "Alert","Alarm","Alert","Alarm","Alarm","Alert","Alarm"),
                 Start.time=c(ymd_hms(st)),
                 End.time=c(ymd_hms(et)))

I need to drop all the "Alert rows" whose time intervals include their chronological consecutive (previous/following) "Alarm rows" time intervals. This check must be performed grouping by Sensor.

In other words the outcome should be something like this:

   Sensor Event          Start.Time            End.Time
2       1 Alarm   05/01/2022 03:55:00 05/01/2022 04:03:00
3       2 Alert   05/01/2022 03:56:00 05/01/2022 04:00:00
4       2 Alarm   06/02/2022 12:31:00 06/02/2022 12:54:00
5       1 Alert   14/01/2022 22:44:00 15/01/2022 00:40:00
6       1 Alarm   14/01/2022 22:36:00 15/01/2022 00:15:00
8       4 Alarm   05/03/2022 15:00:00 05/03/2022 15:30:00
9       3 Alert   07/04/2022 16:25:00 07/04/2022 16:38:00
10      3 Alarm   07/04/2022 16:27:00 07/04/2022 16:40:00
11      4 Alert   08/05/2022 19:00:00 08/05/2022 19:33:00
13      1 Alarm   14/01/2022 22:46:00 15/01/2022 00:12:00
15      2 Alarm   11/06/2022 19:21:00 11/06/2022 20:12:00

Thank you in advance!

CodePudding user response:

I did not achieved your expected output, but got a partial result that might help you

library(dplyr)

df %>% 
  group_by(Sensor) %>% 
  filter(
    !(Event == "Alert" & (End.time > lead(Start.time) ) )
  )

# A tibble: 11 x 4
# Groups:   Sensor [4]
   Sensor Event Start.time          End.time           
    <dbl> <chr> <dttm>              <dttm>             
 1      1 Alarm 2022-01-05 03:55:00 2022-01-05 04:03:00
 2      2 Alert 2022-01-05 03:56:00 2022-01-05 04:00:00
 3      2 Alarm 2022-02-06 12:31:00 2022-02-06 12:54:00
 4      1 Alarm 2022-01-14 22:36:00 2022-01-15 00:15:00
 5      2 Alert 2022-02-06 12:25:00 2022-02-06 13:01:00
 6      4 Alarm 2022-03-05 15:00:00 2022-03-05 15:30:00
 7      3 Alert 2022-04-07 16:25:00 0202-04-07 16:38:00
 8      3 Alarm 2022-04-07 16:27:00 2022-04-07 16:40:00
 9      4 Alarm 2022-05-08 19:12:00 2022-05-08 19:21:00
10      1 Alarm 2022-01-14 22:46:00 2022-01-15 00:12:00
11      2 Alarm 2022-06-11 19:21:00 2022-06-11 20:12:00
  • Related