I am trying to convert times in HH:MM:SS format to dates in YYYY-MM-DD format in R, accounting for midnight.
The times span from the morning of day 1 to the morning of day 2. I want to create a new column holding a date, where a time after midnight indicates a new day. Here's an example:
Current data:
structure(list(ID = c("ID_002", "ID_002", "ID_002", "ID_002",
"ID_002", "ID_002", "ID_002", "ID_002", "ID_002", "ID_002", "ID_002",
"ID_002", "ID_002", "ID_002", "ID_002", "ID_002", "ID_002", "ID_002",
"ID_002", "ID_002", "ID_002", "ID_002", "ID_002", "ID_002", "ID_002",
"ID_002", "ID_002", "ID_002", "ID_002"), Time = c("05:01:00",
"06:01:00", "07:01:00", "08:01:00", "09:01:00", "10:01:00", "11:01:00",
"12:01:00", "13:01:00", "14:01:00", "15:01:00", "16:01:00", "17:01:00",
"18:01:00", "19:01:00", "20:01:00", "21:01:00", "22:01:00", "23:01:00",
"00:01:00", "01:01:00", "02:01:00", "03:01:00", "04:01:00", "05:01:00",
"06:01:00", "07:01:00", "08:01:00", "09:01:00")), row.names = c(NA,
29L), class = "data.frame")
Desired output:
ID Time Date
ID_001 08:01:00 2021-01-20
ID_001 10:01:00 2021-01-20
ID_001 12:01:00 2021-01-20
ID_001 14:01:00 2021-01-20
ID_001 16:01:00 2021-01-20
ID_001 18:01:00 2021-01-20
ID_001 20:01:00 2021-01-20
ID_001 22:01:00 2021-01-20
ID_001 00:01:00 2021-01-21
ID_001 02:01:00 2021-01-21
ID_001 04:01:00 2021-01-21
ID_001 06:01:00 2021-01-21
ID_001 08:01:00 2021-01-21
Thanks for your help!
CodePudding user response:
A combination of dplyr and lubridate is a possible way to get what you want.
First, determine whether a new day has started by checking whether the difference between a time and the previous (lagged) time is negative. If so, flag that row with a 1. Then use cumsum to turn the flags into a running count of days and add that count to the start date.
library(dplyr)
library(lubridate)
first_date <- ymd("2021-01-20")
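# df1 is the data frame from the question
df1 %>%
  mutate(
    # 1 when the time is earlier than the previous row's time (midnight was crossed), else 0
    add_a_day = if_else(hms(Time) - lag(hms(Time), default = hms("00:00:00")) < 0, 1, 0),
    # running count of crossings, added to the start date
    Date = first_date + cumsum(add_a_day)
  ) %>%
  select(-add_a_day)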
ID Time Date
1 ID_002 05:01:00 2021-01-20
2 ID_002 06:01:00 2021-01-20
3 ID_002 07:01:00 2021-01-20
4 ID_002 08:01:00 2021-01-20
5 ID_002 09:01:00 2021-01-20
6 ID_002 10:01:00 2021-01-20
7 ID_002 11:01:00 2021-01-20
8 ID_002 12:01:00 2021-01-20
9 ID_002 13:01:00 2021-01-20
10 ID_002 14:01:00 2021-01-20
11 ID_002 15:01:00 2021-01-20
12 ID_002 16:01:00 2021-01-20
13 ID_002 17:01:00 2021-01-20
14 ID_002 18:01:00 2021-01-20
15 ID_002 19:01:00 2021-01-20
16 ID_002 20:01:00 2021-01-20
17 ID_002 21:01:00 2021-01-20
18 ID_002 22:01:00 2021-01-20
19 ID_002 23:01:00 2021-01-20
20 ID_002 00:01:00 2021-01-21
21 ID_002 01:01:00 2021-01-21
22 ID_002 02:01:00 2021-01-21
23 ID_002 03:01:00 2021-01-21
24 ID_002 04:01:00 2021-01-21
25 ID_002 05:01:00 2021-01-21
26 ID_002 06:01:00 2021-01-21
27 ID_002 07:01:00 2021-01-21
28 ID_002 08:01:00 2021-01-21
29 ID_002 09:01:00 2021-01-21
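To make the flag-and-cumsum idea easier to follow, here is a minimal base R sketch of the same logic on a short vector, without dplyr or lubridate. The vector `times` and the names `secs` and `rollover` are made up for illustration, and the sketch assumes the rows are already in chronological order with less than 24 hours between consecutive rows.

```r
# Hypothetical short input: two times before midnight, two after.
times <- c("22:01:00", "23:01:00", "00:01:00", "01:01:00")
first_date <- as.Date("2021-01-20")

# Convert HH:MM:SS strings to seconds since midnight.
parts <- do.call(rbind, strsplit(times, ":", fixed = TRUE))
secs <- as.numeric(parts[, 1]) * 3600 +
  as.numeric(parts[, 2]) * 60 +
  as.numeric(parts[, 3])

# A negative difference means the clock wrapped past midnight.
# The first row has no previous row, so it is never a rollover.
rollover <- c(FALSE, diff(secs) < 0)

# cumsum() counts how many midnights have passed up to each row.
dates <- first_date + cumsum(rollover)

data.frame(Time = times, Date = dates)
# Date column: 2021-01-20, 2021-01-20, 2021-01-21, 2021-01-21
```

Working in plain seconds keeps the comparison independent of how a particular package compares its time classes.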
CodePudding user response:
Will this do?
library(tidyverse)
starting_date <- as.Date('2021-01-20')
library(lubridate)
df %>%
  # TRUE (counted as 1) whenever the previous time is later than the current one
  mutate(Date = starting_date + cumsum(lag(hms(Time), default = hms('00:00:01')) > hms(Time)))
#> ID Time Date
#> 1 ID_002 05:01:00 2021-01-20
#> 2 ID_002 06:01:00 2021-01-20
#> 3 ID_002 07:01:00 2021-01-20
#> 4 ID_002 08:01:00 2021-01-20
#> 5 ID_002 09:01:00 2021-01-20
#> 6 ID_002 10:01:00 2021-01-20
#> 7 ID_002 11:01:00 2021-01-20
#> 8 ID_002 12:01:00 2021-01-20
#> 9 ID_002 13:01:00 2021-01-20
#> 10 ID_002 14:01:00 2021-01-20
#> 11 ID_002 15:01:00 2021-01-20
#> 12 ID_002 16:01:00 2021-01-20
#> 13 ID_002 17:01:00 2021-01-20
#> 14 ID_002 18:01:00 2021-01-20
#> 15 ID_002 19:01:00 2021-01-20
#> 16 ID_002 20:01:00 2021-01-20
#> 17 ID_002 21:01:00 2021-01-20
#> 18 ID_002 22:01:00 2021-01-20
#> 19 ID_002 23:01:00 2021-01-20
#> 20 ID_002 00:01:00 2021-01-21
#> 21 ID_002 01:01:00 2021-01-21
#> 22 ID_002 02:01:00 2021-01-21
#> 23 ID_002 03:01:00 2021-01-21
#> 24 ID_002 04:01:00 2021-01-21
#> 25 ID_002 05:01:00 2021-01-21
#> 26 ID_002 06:01:00 2021-01-21
#> 27 ID_002 07:01:00 2021-01-21
#> 28 ID_002 08:01:00 2021-01-21
#> 29 ID_002 09:01:00 2021-01-21
Created on 2021-11-27 by the reprex package (v2.0.0)
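If the real data holds more than one ID, the running count probably should restart for each ID. The following is only a sketch under assumptions that are not in the question: `df_multi` is a made-up data frame, every ID is assumed to start on the same `starting_date`, and rows are assumed to be in chronological order within each ID.

```r
library(dplyr)
library(lubridate)

# Hypothetical data with two IDs, each crossing midnight once.
df_multi <- data.frame(
  ID   = c("A", "A", "A", "B", "B", "B"),
  Time = c("22:01:00", "23:01:00", "00:01:00",
           "23:30:00", "00:30:00", "01:30:00")
)

# Assumption: every ID starts on this date.
starting_date <- as.Date("2021-01-20")

result <- df_multi %>%
  group_by(ID) %>%
  mutate(
    # Seconds since midnight, so lag() and the comparison work on plain numbers.
    secs = period_to_seconds(hms(Time)),
    # Within each ID, count the rows where the time went backwards.
    Date = starting_date + cumsum(secs < lag(secs, default = first(secs)))
  ) %>%
  ungroup() %>%
  select(-secs)

result
```

Using `first(secs)` as the lag default means the first row of each group is compared with itself, so it is never counted as a new day.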