I have a a number of periods which I would like to split by 5 minute intervals (00:00, 00:05, 00:10, 00:15, ...), if they span across such 5 minute intervals.
How do I turn
periods_raw <- read.table(header=TRUE, text="
COLOR UTC_DateTime_Start UTC_DateTime_End
Green 2022-06-15 14:56:23 2022-06-15 15:03:11
Purple 2022-06-15 15:03:11 2022-06-15 15:14:48
Red 2022-06-15 16:27:11 2022-06-15 16:29:48
")
into
periods_split <- read.table(header=TRUE, text="
COLOR UTC_DateTime_Start UTC_DateTime_End
Green 2022-06-15 14:56:23 2022-06-15 15:00:00
Green 2022-06-15 15:00:00 2022-06-15 15:03:11
Purple 2022-06-15 15:03:11 2022-06-15 15:05:00
Purple 2022-06-15 15:05:00 2022-06-15 15:10:00
Purple 2022-06-15 15:10:00 2022-06-15 15:14:48
Red 2022-06-15 16:27:11 2022-06-15 16:29:48
")
CodePudding user response:
If the 'UTC_DateTime' columns are not Datetime
class (POSIXct
), convert (ymd_hms
from lubridate
), then loop over the columns 'start', 'end' (suffix), create a seq
uence by '5 minute', use floor_date
for the elements that are not the first or the last, create a tibble
by removing the last and first observation from the sequence to create the new 'Start', 'End' columns and unnest
the list
column
library(dplyr)
library(tidyr)
library(lubridate)
library(purrr)
periods_raw %>%
mutate(across(starts_with("UTC_DateTime"), ymd_hms)) %>%
mutate(new = map2(UTC_DateTime_Start, UTC_DateTime_End, ~ {
v1 <- c(seq(.x, .y, by = "5 min"), .y)
v1[-c(1, length(v1))] <- floor_date(v1[-c(1, length(v1))], "5 min")
tibble(UTC_DateTime_Start = v1[-length(v1)], UTC_DateTime_End = v1[-1])
}), .keep = "unused") %>%
unnest(new)
-output
# A tibble: 6 × 3
COLOR UTC_DateTime_Start UTC_DateTime_End
<chr> <dttm> <dttm>
1 Green 2022-06-15 14:56:23 2022-06-15 15:00:00
2 Green 2022-06-15 15:00:00 2022-06-15 15:03:11
3 Purple 2022-06-15 15:03:11 2022-06-15 15:05:00
4 Purple 2022-06-15 15:05:00 2022-06-15 15:10:00
5 Purple 2022-06-15 15:10:00 2022-06-15 15:14:48
6 Red 2022-06-15 16:27:11 2022-06-15 16:29:48
data
periods_raw <- structure(list(COLOR = c("Green", "Purple", "Red"), UTC_DateTime_Start = c("2022-06-15 14:56:23",
"2022-06-15 15:03:11", "2022-06-15 16:27:11"), UTC_DateTime_End = c("2022-06-15 15:03:11",
"2022-06-15 15:14:48", "2022-06-15 16:29:48")), class = "data.frame", row.names = c(NA,
-3L))