Home > Back-end >  How to Filter Datetime by 'N' minute intervals in R?
How to Filter Datetime by 'N' minute intervals in R?

Time:08-19

I have a datetime column that I'd like to filter down to only contain 15 minute intervals with the first row's datetime being the baseline. Observe:

        order_dates  
1     2022-08-14 12:15:10
2     2022-08-14 12:15:11
3     2022-08-14 12:15:13 
4     2022-08-14 12:20:10
5     2022-08-14 12:20:16
6     2022-08-14 12:20:14
7     2022-08-14 12:25:19
8     2022-08-14 12:25:12      
9     2022-08-14 12:25:20 
10     2022-08-14 12:30:23
11     2022-08-14 12:30:31      
12     2022-08-14 12:30:34 
13     2022-08-14 12:40:32
14     2022-08-14 12:40:52  
15     2022-08-14 12:40:51
16     2022-08-14 12:45:40
17     2022-08-14 12:45:45
18     2022-08-14 12:45:23    

I would like the final dataset to only contain rows that contain 15 minute time-intervals with 12:15 as the baseline. So, I would like my final dataset to only contain rows 1,2,3,10,11,12,16,17,18. I would also like the final times to have the seconds column removed as well. Also, the order_dates column is of the POSIXct class

CodePudding user response:

Perhaps we extract the minutes part and create a logical expression

library(dplyr)
library(lubridate)
df1 %>% 
   mutate(order_dates = ymd_hms(order_dates)) %>%
   filter(! minute(order_dates)% )

-output

     order_dates
1  2022-08-14 12:15:10
2  2022-08-14 12:15:11
3  2022-08-14 12:15:13
10 2022-08-14 12:30:23
11 2022-08-14 12:30:31
12 2022-08-14 12:30:34
16 2022-08-14 12:45:40
17 2022-08-14 12:45:45
18 2022-08-14 12:45:23

data

df1 <- structure(list(order_dates = c("2022-08-14 12:15:10", "2022-08-14 12:15:11", 
"2022-08-14 12:15:13", "2022-08-14 12:20:10", "2022-08-14 12:20:16", 
"2022-08-14 12:20:14", "2022-08-14 12:25:19", "2022-08-14 12:25:12", 
"2022-08-14 12:25:20", "2022-08-14 12:30:23", "2022-08-14 12:30:31", 
"2022-08-14 12:30:34", "2022-08-14 12:40:32", "2022-08-14 12:40:52", 
"2022-08-14 12:40:51", "2022-08-14 12:45:40", "2022-08-14 12:45:45", 
"2022-08-14 12:45:23")), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16", "17", "18"))

CodePudding user response:

Using dplyr and lubridate:

library(dplyr)
library(lubridate)

df %>% 
  mutate(order_dates = format(order_dates ,format = "%Y-%m-%d %H:%M"),
         mins = lubridate::minute(order_dates)) %>% 
  filter(mins %in% c(15, 30, 45)) %>% 
  select(-mins)

This gives:

# A tibble: 9 × 1
  order_dates     
  <chr>           
1 2022-08-14 12:15
2 2022-08-14 12:15
3 2022-08-14 12:15
4 2022-08-14 12:30
5 2022-08-14 12:30
6 2022-08-14 12:30
7 2022-08-14 12:45
8 2022-08-14 12:45
9 2022-08-14 12:45

Data:

structure(list(order_dates = structure(c(1660479310, 1660479311, 
1660479313, 1660479610, 1660479616, 1660479614, 1660479919, 1660479912, 
1660479920, 1660480223, 1660480231, 1660480234, 1660480832, 1660480852, 
1660480851, 1660481140, 1660481145, 1660481123), class = c("POSIXct", 
"POSIXt"), tzone = "UTC")), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -18L))

CodePudding user response:

To allow this to work for any initial time, and to produce the formatting you desire, you could do:

library(lubridate)
library(tidyverse)

df %>%
  mutate(order_dates = floor_date(order_dates, unit = 'min'),
         steps = as.numeric(order_dates - order_dates[1])/60) %>%
  filter(steps %% 15 == 0) %>%
  select(-steps) %>%
  mutate(order_dates = substr(order_dates, 1, 16))
#>        order_dates
#> 1 2022-08-14 12:15
#> 2 2022-08-14 12:15
#> 3 2022-08-14 12:15
#> 4 2022-08-14 12:30
#> 5 2022-08-14 12:30
#> 6 2022-08-14 12:30
#> 7 2022-08-14 12:45
#> 8 2022-08-14 12:45
#> 9 2022-08-14 12:45
  • Related