Date/Time formatting with variables containing day of the week

Time:06-08

I'm trying to format a date column whose values include the day of the week. When I format it with my usual method, the time always comes out as 00:00. Here is an example:

a <- c(1:6)
b <- c("Tuesday, 05 December 2012 05:00","Tuesday, 05 December 2012 06:55","Tuesday, 05 December 2012 07:10",
       "Tuesday, 05 December 2012 10:23", "Tuesday, 05 December 2012 11:43","Tuesday, 05 December 2012 13:04")
c <-c("0","0","0","1","1","1")
df1 <- data.frame(a,b,c,stringsAsFactors = FALSE)

If I try to change the format using this code:

df1 <- df1 %>%
  mutate(DateTime = format(as.Date(b, format = "%A, %d %B %Y %H:%M"), "%d-%m-%Y %H:%M"))

I get a result where the hours and minutes are always 00:00. I believe this has something to do with the date not being in POSIXct/POSIXlt format, but I'm not sure.

  a                            Date c         DateTime
1 1 Tuesday, 05 December 2012 05:00 0 05-12-2012 00:00
2 2 Tuesday, 05 December 2012 06:55 0 05-12-2012 00:00
3 3 Tuesday, 05 December 2012 07:10 0 05-12-2012 00:00
4 4 Tuesday, 05 December 2012 10:23 1 05-12-2012 00:00
5 5 Tuesday, 05 December 2012 11:43 1 05-12-2012 00:00
6 6 Tuesday, 05 December 2012 13:04 1 05-12-2012 00:00

Thanks!

CodePudding user response:

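This could be a locale issue. Note also that as.Date() keeps only the calendar date and drops the time of day, so parse with as.POSIXct() instead:

# store the current locale and switch to English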
orig_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "English.UTF-8")
as.POSIXct(df1$b, format = "%A, %d %B %Y %H:%M")
#reset locale
Sys.setlocale("LC_TIME", orig_locale)

# [1] "2012-12-05 05:00:00 CET" "2012-12-05 06:55:00 CET" "2012-12-05 07:10:00 CET" "2012-12-05 10:23:00 CET"
# [5] "2012-12-05 11:43:00 CET" "2012-12-05 13:04:00 CET"
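The locale name "English.UTF-8" is platform specific, so here is a minimal sketch of the same idea that switches LC_TIME to "C" instead (which uses English day and month names) and restores it automatically. The helper name parse_english_datetime() is hypothetical, and tz = "UTC" is an assumption made to keep the result independent of the machine's time zone. It also shows why the as.Date() route prints 00:00.

```r
# Hypothetical helper: parse English date-time strings regardless of the
# session locale by switching LC_TIME to "C" for the duration of the call.
parse_english_datetime <- function(x, format, tz = "UTC") {
  old <- Sys.getlocale("LC_TIME")
  # restore the original locale even if parsing fails
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", "C")
  as.POSIXct(x, format = format, tz = tz)
}

b <- c("Tuesday, 05 December 2012 05:00",
       "Tuesday, 05 December 2012 13:04")

parsed <- parse_english_datetime(b, "%A, %d %B %Y %H:%M")

# POSIXct keeps the time of day
with_time <- format(parsed, "%d-%m-%Y %H:%M")

# a Date has no time component, so %H:%M falls back to 00:00
without_time <- format(as.Date(parsed), "%d-%m-%Y %H:%M")

with_time
without_time
```

Inside the question's pipeline, the same call could replace as.Date(), for example mutate(DateTime = format(parse_english_datetime(b, "%A, %d %B %Y %H:%M"), "%d-%m-%Y %H:%M")).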

CodePudding user response:

Here is another approach:

# data 
b <- c("Tuesday, 05 December 2012 05:00","Tuesday, 05 December 2012 06:55","Tuesday, 05 December 2012 07:10",
       "Tuesday, 05 December 2012 10:23", "Tuesday, 05 December 2012 11:43","Tuesday, 05 December 2012 13:04")

# remove everything before the first digit (drops the weekday name)
b <- sub("^\\D+", "", b)

# the strings are now day-month-year, so use dmy_hm() from {lubridate}
library(lubridate)
dmy_hm(b)
#> [1] "2012-12-05 05:00:00 UTC" "2012-12-05 06:55:00 UTC"
#> [3] "2012-12-05 07:10:00 UTC" "2012-12-05 10:23:00 UTC"
#> [5] "2012-12-05 11:43:00 UTC" "2012-12-05 13:04:00 UTC"

Created on 2022-06-07 by the reprex package (v2.0.1)

EDIT

To be in line with your style:

a <- c(1:6)
b <- c("Tuesday, 05 December 2012 05:00","Tuesday, 05 December 2012 06:55","Tuesday, 05 December 2012 07:10",
       "Tuesday, 05 December 2012 10:23", "Tuesday, 05 December 2012 11:43","Tuesday, 05 December 2012 13:04")
c <-c("0","0","0","1","1","1")

df1 <- data.frame(a, b, c, 
                  stringsAsFactors = FALSE)

library(lubridate)
library(dplyr)

df1 <- df1 |>
  mutate(Datetime = dmy_hm(sub("^\\D+", "", b))) |>
  rename(Date = b)

head(df1)
#>   a                            Date c            Datetime
#> 1 1 Tuesday, 05 December 2012 05:00 0 2012-12-05 05:00:00
#> 2 2 Tuesday, 05 December 2012 06:55 0 2012-12-05 06:55:00
#> 3 3 Tuesday, 05 December 2012 07:10 0 2012-12-05 07:10:00
#> 4 4 Tuesday, 05 December 2012 10:23 1 2012-12-05 10:23:00
#> 5 5 Tuesday, 05 December 2012 11:43 1 2012-12-05 11:43:00
#> 6 6 Tuesday, 05 December 2012 13:04 1 2012-12-05 13:04:00

Created on 2022-06-07 by the reprex package (v2.0.1)
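If you want the "%d-%m-%Y %H:%M" text from the question rather than a date-time column, one option is to call format() on the parsed values. This is only a sketch: it assumes {lubridate} is installed and that the session locale uses English month names, and the variable names stripped, parsed and formatted are just for illustration.

```r
library(lubridate)

b <- c("Tuesday, 05 December 2012 05:00",
       "Tuesday, 05 December 2012 13:04")

# drop everything before the first digit, i.e. the weekday name
stripped <- sub("^\\D+", "", b)

# parse day-month-year hour:minute; the result is POSIXct in UTC
parsed <- dmy_hm(stripped)

# format() turns the date-times back into character strings
formatted <- format(parsed, "%d-%m-%Y %H:%M")
formatted
```

Keep in mind that the formatted result is character, so it is better to keep the POSIXct column for sorting or arithmetic and format only for display.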
