Check if any dates within an Interval are within any of the dates in another Interval

Time:11-26

I have two vectors of date intervals, and I want to check whether any of the dates in interval_A fall within interval_B. Ideally, I am looking for a dplyr solution.

The data

library(lubridate)

interval_A <- 
new("Interval", .Data = c(20822400, 10454400, 42508800, 18662400, 
12355200, 16243200, 10195200, 14774400, 37324800, 31276800, 27734400, 
62985600, 15724800, 32054400, 21427200), start = structure(c(94953600, 
131328000, 240451200, 294278400, 334454400, 449193600, 493344000, 
546739200, 575596800, 760320000, 930700800, 1088553600, 1481673600, 
1513123200, 1647388800), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), tzone = "UTC")


interval_B <- 
new("Interval", .Data = c(41904000, 15724800, 42163200, 20995200, 
21168000, 47347200, 5184000), start = structure(c(120960000, 
315532800, 362793600, 646790400, 983404800, 1196467200, 1580515200
), tzone = "UTC", class = c("POSIXct", "POSIXt")), tzone = "UTC")


 interval_A
 [1] 1973-01-04 UTC--1973-09-02 UTC 1974-03-01 UTC--1974-06-30 UTC 1977-08-15 UTC--1978-12-20 UTC
 [4] 1979-04-30 UTC--1979-12-02 UTC 1980-08-07 UTC--1980-12-28 UTC 1984-03-27 UTC--1984-10-01 UTC
 [7] 1985-08-20 UTC--1985-12-16 UTC 1987-04-30 UTC--1987-10-18 UTC 1988-03-29 UTC--1989-06-04 UTC
[10] 1994-02-04 UTC--1995-02-01 UTC 1999-06-30 UTC--2000-05-16 UTC 2004-06-30 UTC--2006-06-29 UTC
[13] 2016-12-14 UTC--2017-06-14 UTC 2017-12-13 UTC--2018-12-19 UTC 2022-03-16 UTC--2022-11-19 UTC


interval_B
[1] 1973-11-01 UTC--1975-03-01 UTC 1980-01-01 UTC--1980-07-01 UTC 1981-07-01 UTC--1982-11-01 UTC
[4] 1990-07-01 UTC--1991-03-01 UTC 2001-03-01 UTC--2001-11-01 UTC 2007-12-01 UTC--2009-06-01 UTC
[7] 2020-02-01 UTC--2020-04-01 UTC

I was hoping this would be simple using the following code, but it throws an error:

as.list(interval_A) %within% as.list(interval_B)

Error in as.list(interval_A) %within% as.list(interval_B) : 
  No %within% method with signature a = list,  b = list

Another solution might be to expand every date in interval_A and check it against interval_B, but I was hoping there might be an easier way (and I am not sure whether there is a simple way to convert interval_A into a vector of dates).

Thanks for any thoughts!

CodePudding user response:

Use sapply() to test each element of interval_A against every element of interval_B:

library(lubridate)

A_in_B <- sapply(interval_A, \(x) any(x %within% interval_B))
A_in_B
# FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

interval_A[A_in_B]
# 1974-03-01 UTC--1974-06-30 UTC
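
Note that %within% is TRUE only when an interval lies entirely inside another one. If "any of the dates" is meant as a partial overlap, lubridate's int_overlaps() may be closer to what is wanted. The following is a minimal sketch of the difference, not a definitive implementation: the intervals a and b are made-up example data rather than the data from the question, and overlaps_any() and within_any() are hypothetical helper names introduced here for illustration.

```r
library(lubridate)

# Hypothetical example data (not the intervals from the question)
a <- interval(ymd(c("1973-01-04", "1974-03-01", "1974-12-01")),
              ymd(c("1973-09-02", "1974-06-30", "1975-06-01")))
b <- interval(ymd(c("1973-11-01", "1980-01-01")),
              ymd(c("1975-03-01", "1980-07-01")))

# Hypothetical helper: for each element of x, TRUE if it lies
# entirely inside at least one element of y
within_any <- function(x, y) {
  vapply(seq_along(x), function(i) any(x[i] %within% y), logical(1))
}

# Hypothetical helper: for each element of x, TRUE if it shares
# at least one instant with some element of y
overlaps_any <- function(x, y) {
  vapply(seq_along(x), function(i) any(int_overlaps(x[i], y)), logical(1))
}

within_any(a, b)
# FALSE  TRUE FALSE

overlaps_any(a, b)
# FALSE  TRUE  TRUE
```

The third example interval starts inside b[1] but ends after it, so it counts as an overlap without being contained. On the data in the question the two tests appear to agree, since the only element of interval_A that touches interval_B is fully contained in it.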

The %within% test can also be written with the tidyverse, producing a data frame of the results:

library(lubridate)
library(tibble)
library(purrr)

tibble(
  interval_A,
  in_interval_B = map_lgl(interval_A, ~ any(.x %within% interval_B))
)
# A tibble: 15 × 2
   interval_A                     in_interval_B
   <Interval>                     <lgl>        
 1 1973-01-04 UTC--1973-09-02 UTC FALSE        
 2 1974-03-01 UTC--1974-06-30 UTC TRUE         
 3 1977-08-15 UTC--1978-12-20 UTC FALSE        
 4 1979-04-30 UTC--1979-12-02 UTC FALSE        
 5 1980-08-07 UTC--1980-12-28 UTC FALSE        
 6 1984-03-27 UTC--1984-10-01 UTC FALSE        
 7 1985-08-20 UTC--1985-12-16 UTC FALSE        
 8 1987-04-30 UTC--1987-10-18 UTC FALSE        
 9 1988-03-29 UTC--1989-06-04 UTC FALSE        
10 1994-02-04 UTC--1995-02-01 UTC FALSE        
11 1999-06-30 UTC--2000-05-16 UTC FALSE        
12 2004-06-30 UTC--2006-06-29 UTC FALSE        
13 2016-12-14 UTC--2017-06-14 UTC FALSE        
14 2017-12-13 UTC--2018-12-19 UTC FALSE        
15 2022-03-16 UTC--2022-11-19 UTC FALSE        
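
For completeness, the day-by-day expansion mentioned in the question can be sketched as well. This is only an illustration under stated assumptions: it uses daily resolution, the intervals a and b are made-up example data, and days_in() is a hypothetical helper name. It is likely slower than comparing interval endpoints, so it mainly shows how one interval can be turned into a vector of dates.

```r
library(lubridate)

# Hypothetical example data (not the intervals from the question)
a <- interval(ymd("1973-10-30"), ymd("1973-11-02"))
b <- interval(ymd(c("1973-11-01", "1980-01-01")),
              ymd(c("1975-03-01", "1980-07-01")))

# Hypothetical helper: every calendar day covered by a single interval
days_in <- function(x) {
  seq(as_date(int_start(x)), as_date(int_end(x)), by = "day")
}

days <- days_in(a)

# For each day, check whether it falls inside any element of b
day_hits <- vapply(seq_along(days),
                   function(i) any(days[i] %within% b),
                   logical(1))

day_hits
# FALSE FALSE  TRUE  TRUE

any(day_hits)
# TRUE
```

To apply this to a whole vector of intervals, days_in() would need to be called on one element at a time (for example x[i] inside a loop), because seq() expects a single start and end.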