Home > front end >  Mapping ymd_hms time into 15 minute time intervals
Mapping ymd_hms time into 15 minute time intervals

Time:12-01

I am looking for a data.table solution to the following question.

Suppose I have the following dataset:

library(data.table)
library(lubridate)

DT <- data.table(date = c("2019-01-01 00:03:04", 
                          "2019-01-01 00:07:03",
                          "2019-01-01 00:15:23",
                          "2019-01-01 00:16:28",
                          "2019-01-01 00:21:30"))
DT
         date
1: 2019-01-01 00:03:04
2: 2019-01-01 00:07:03
3: 2019-01-01 00:15:23
4: 2019-01-01 00:16:28
5: 2019-01-01 00:21:30


I am looking to map each date into 15 minute interval for each day. So the final dataset would look like:

   date                    date_15min_grp
1: 2019-01-01 00:03:04              1
2: 2019-01-01 00:07:03              1
3: 2019-01-01 00:15:23              2
4: 2019-01-01 00:16:28              2
5: 2019-01-01 00:21:30              2

The groups would be unique at the day level, so 2019-01-01 00:03:04 and 2019-01-02 00:03:04 would both return values of 1. My current solution is to extract the hour/minute/second component of each day. I would then add the values (after conversion into minutes), divide them by 60, and try to map each value to the nearest "group". But this seems to be slow and inelegant.

A data.table/lubridate would be greatly appreciated.

Thanks so much!

CodePudding user response:

Please find below a solution using data.table and lubridate

Reprex

  • Code
library(data.table)
library(lubridate)

DT[, date_15min_grp := fcase(minute(date) < 15, 1,
                             minute(date) < 30, 2,
                             minute(date) < 45, 3,
                             default = 4)][]
  • Output
#>                   date date_15min_grp
#> 1: 2019-01-01 00:03:04              1
#> 2: 2019-01-01 00:07:03              1
#> 3: 2019-01-01 00:15:23              2
#> 4: 2019-01-01 00:16:28              2
#> 5: 2019-01-01 00:21:30              2

Created on 2021-11-30 by the reprex package (v2.0.1)


AS A FOLLOW-UP TO YOUR COMMENT

  • Code
library(data.table)
library(lubridate)

DT[, date_15min_grp := fcase(minute(date) < 15, hour(date)*4   1,
                             minute(date) < 30, hour(date)*4   2,
                             minute(date) < 45, hour(date)*4   3,
                             minute(date) < 60, hour(date)*4   4)][]
  • Output
#>                   date date_15min_grp
#> 1: 2019-01-01 00:03:04              1
#> 2: 2019-01-01 00:07:03              1
#> 3: 2019-01-01 00:15:23              2
#> 4: 2019-01-01 00:16:28              2
#> 5: 2019-01-01 00:21:30              2

Created on 2021-12-01 by the reprex package (v2.0.1)

CodePudding user response:

You should see if the slider package works for you. It can both cut the data by rows, but also apply functions like the apply / purrr functionality.

library(tibble)
library(slider)
library(dplyr)

f <- data.frame(date = c(as.POSIXlt("2019-01-01 00:03:04"), 
         as.POSIXlt("2019-01-01 00:07:03"), as.POSIXlt("2019-01-01 00:15:23"),
         as.POSIXlt("2019-01-01 00:16:28"), as.POSIXlt("2019-01-01 00:21:30"), 
         as.POSIXlt("2019-01-01 00:22:03"), as.POSIXlt("2019-01-01 00:25:23"),
         as.POSIXlt("2019-01-01 00:36:28"), as.POSIXlt("2019-01-01 00:41:30"), 
         as.POSIXlt("2019-01-01 00:47:03"), as.POSIXlt("2019-01-01 00:48:23"),
         as.POSIXlt("2019-01-01 00:51:28"), as.POSIXlt("2019-01-01 00:51:30"), 
         as.POSIXlt("2019-01-01 00:57:03"), as.POSIXlt("2019-01-01 00:61:23"),
         as.POSIXlt("2019-01-01 00:66:28"))) %>% arrange(date)

g <- block(f, f$date, period = "minute", every=15)

CodePudding user response:

Two things:

  1. I think your 5th row should be group 2, since it is before 00:30:00 of the day. Please clarify if this is incorrect.

  2. You say it needs to be per-day, but your data only includes one day; I'll augment it to demonstrate its per-day calculations.

DT[,date := as.POSIXct(date)]
DT2 <- rbindlist(list(DT, DT[, date := date   86400]))
DT2
#                    date   grp
#                  <POSc> <int>
#  1: 2019-01-02 00:03:04     1
#  2: 2019-01-02 00:07:03     1
#  3: 2019-01-02 00:15:23     2
#  4: 2019-01-02 00:16:28     2
#  5: 2019-01-02 00:21:30     2
#  6: 2019-01-02 00:03:04     1
#  7: 2019-01-02 00:07:03     1
#  8: 2019-01-02 00:15:23     2
#  9: 2019-01-02 00:16:28     2
# 10: 2019-01-02 00:21:30     2

And the group-assignment:

DT2[, day := format(date, format = "%Y%m%d")
  ][, grp := findInterval(date, seq(lubridate::floor_date(min(date), unit = "hours"), max(date)   3600, by = "15 mins")), by = day][]
#                    date   grp      day
#                  <POSc> <int>   <char>
#  1: 2019-01-02 00:03:04     1 20190102
#  2: 2019-01-02 00:07:03     1 20190102
#  3: 2019-01-02 00:15:23     2 20190102
#  4: 2019-01-02 00:16:28     2 20190102
#  5: 2019-01-02 00:21:30     2 20190102
#  6: 2019-01-02 00:03:04     1 20190102
#  7: 2019-01-02 00:07:03     1 20190102
#  8: 2019-01-02 00:15:23     2 20190102
#  9: 2019-01-02 00:16:28     2 20190102
# 10: 2019-01-02 00:21:30     2 20190102
  • Related