How to create timestamp sequence for multiple groups in R?-CodePudding

This is my sample data. I want to create sequence for each group as shown below .

dt1<-data.frame(group=c("a","b"),start_time=c("2022-06-9 12:56:00","2022-06-9 12:52:00"),
            end_time=c("2022-06-9 12:59:00","2022-06-9 12:56:00"))

> dt1
group         start_time           end_time
1     a 2022-06-9 12:56:00 2022-06-9 12:59:00
2     b 2022-06-9 12:52:00 2022-06-9 12:56:00

This is what I expect my data to look like

dt2<-data.frame(group=c("a","a","a","a","b","b","b","b","b"),
                     time_stamp=c("2022-06-9 12:56:00","2022-06-9 12:57:00","2022-06-9 12:58:00","2022-06-9 12:59:00",
                                  "2022-06-9 12:52:00","2022-06-9 12:53:00","2022-06-9 12:54:00","2022-06-9 12:55:00",
                                  "2022-06-9 12:56:00"))

> dt2
group         time_stamp
1     a 2022-06-9 12:56:00
2     a 2022-06-9 12:57:00
3     a 2022-06-9 12:58:00
4     a 2022-06-9 12:59:00
5     b 2022-06-9 12:52:00
6     b 2022-06-9 12:53:00
7     b 2022-06-9 12:54:00
8     b 2022-06-9 12:55:00
9     b 2022-06-9 12:56:00

This is what I tried to do after transforming variables into POXIct types

  dt2<-dt1%>%group_by(group)%>%
  mutate(time_stamp=seq(start_date,end_date,by = "1 min"))

  Error in `mutate()`:
  ! Problem while computing `time_stamp = seq(date1, date2, by = "1 min")`.
  x `time_stamp` must be size 1, not 72421.

CodePudding user response：

With data.table:

library(data.table)
setDT(dt1)[ , list(group = group, time_stamp = seq(as.POSIXct(start_time), as.POSIXct(end_time), by = "1 min")), by = 1:nrow(dt1)]

   nrow group          time_stamp
1:    1     a 2022-06-09 12:56:00
2:    1     a 2022-06-09 12:57:00
3:    1     a 2022-06-09 12:58:00
4:    1     a 2022-06-09 12:59:00
5:    2     b 2022-06-09 12:52:00
6:    2     b 2022-06-09 12:53:00
7:    2     b 2022-06-09 12:54:00
8:    2     b 2022-06-09 12:55:00
9:    2     b 2022-06-09 12:56:00

CodePudding user response：

You could pivot your data into a "long" format first, then use seq in complete to generate the sequence. Also, you would need to convert the "character" type of the column into "date time" (as.POSIXct).

library(tidyverse)

dt1 %>% pivot_longer(ends_with("time"), names_to = "temp", values_to = "time_stamp") %>% 
  select(-temp) %>% 
  mutate(time_stamp = as.POSIXct(time_stamp)) %>% 
  group_by(group) %>% 
  complete(time_stamp = seq(min(time_stamp), max(time_stamp), by = "min"))

# A tibble: 9 × 2
# Groups:   group [2]
  group time_stamp         
  <chr> <dttm>             
1 a     2022-06-09 12:56:00
2 a     2022-06-09 12:57:00
3 a     2022-06-09 12:58:00
4 a     2022-06-09 12:59:00
5 b     2022-06-09 12:52:00
6 b     2022-06-09 12:53:00
7 b     2022-06-09 12:54:00
8 b     2022-06-09 12:55:00
9 b     2022-06-09 12:56:00