Home > OS >  Join by time stamp within time period
Join by time stamp within time period

Time:06-22

I have two tables, which I would like to join by date. If an instance falls within a color period, I would like that instance to receive said color as a variable (see joined):

instance <- read.table(header=TRUE, text="
ID      UTC_DateTime            Value
1       2022-06-15 15:00:00     8
2       2022-06-15 15:05:00     3
3       2022-06-15 15:10:00     2
4       2022-06-15 15:15:00     4
")

periods <- read.table(header=TRUE, text="
COLOR           UTC_DateTime          
Start_Green     2022-06-15 14:56:22
End_Green       2022-06-15 15:03:08
Start_Red       2022-06-15 15:03:11
End_Red         2022-06-15 15:58:48
")

joined <- read.table(header=TRUE, text="
ID      UTC_DateTime            Value     Color
1       2022-06-15 15:00:00     8         Green
2       2022-06-15 15:05:00     3         Green
3       2022-06-15 15:10:00     2         Red
4       2022-06-15 15:15:00     4         Red
")

Also, if you could somehow incorporate a process which converts periods to periods_wide that would be great, see:

periods_wide <- read.table(header=TRUE, text="
COLOR           UTC_DateTime_Start      UTC_DateTime_End
Green           2022-06-15 14:56:22     2022-06-15 15:03:08
Red             2022-06-15 15:03:11     2022-06-15 15:58:48
")

Please be aware that there might be many more instances and also many more periods (of the same or other colors)

CodePudding user response:

A possible solution, based on powerjoin::power_inner_join and lubridate. Notice that read.table was not reading the data as intended and, therefore, I used read.csv:

library(tidyverse)
library(powerjoin)
library(lubridate)

instance <- read.csv(header=TRUE, text="
ID,      UTC_DateTime,            Value
1,       2022-06-15 15:00:00,     8
2,       2022-06-15 15:05:00,     3
3,       2022-06-15 15:10:00,     2
4,       2022-06-15 15:15:00,     4
", strip.white = T)

periods <- read.csv(header=TRUE, text="
COLOR,           UTC_DateTime          
Start_Green,     2022-06-15 14:56:22
End_Green,       2022-06-15 15:03:08
Start_Red,       2022-06-15 15:03:11
End_Red,         2022-06-15 15:58:48
", strip.white = T)

periods %>% 
  mutate(COLOR = str_remove(COLOR, "Start_|End_"), 
         name = rep(c("d1", "d2"), 2)) %>% 
  pivot_wider(COLOR, values_from = UTC_DateTime) %>% 
  power_inner_join(instance,., 
                   by = c(~ ymd_hms(.x$UTC_DateTime) <= ymd_hms(.y$d2),  
                          ~ ymd_hms(.x$UTC_DateTime) >= ymd_hms(.y$d1))) %>% 
  select(-(d1:d2))

#>   ID        UTC_DateTime Value COLOR
#> 1  1 2022-06-15 15:00:00     8 Green
#> 2  2 2022-06-15 15:05:00     3   Red
#> 3  3 2022-06-15 15:10:00     2   Red
#> 4  4 2022-06-15 15:15:00     4   Red
  • Related