I have a data frame whose date column is of class character. I've tried parsing it with as.Date, but the number of NAs is worrisome. The dates are in the following formats: "2003-10-19" and "October 05, 2018".
date <- c("October 05, 2018", "2003-10-19")
as.Date(date)
This is what I tried, but most of my results came back as NA.
CodePudding user response:
Here is an option:
date <- c("October 05, 2018", "2003-10-19", "10/9/95", "6 Oct.2010")
lubridate::parse_date_time(date, orders = c("mdy", "ymd", "dmy"))
#> [1] "2018-10-05 UTC" "2003-10-19 UTC" "1995-10-09 UTC" "2010-10-06 UTC"
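To apply this to a data frame column, as in the question, something like the following sketch may help. The data frame `df` and its column `date` are hypothetical stand-ins for your own data, and the `orders` vector covers only the two formats mentioned in the question. Wrapping the result in as.Date() drops the time and time-zone part. Month names such as "October" are matched according to the current locale.

```r
library(lubridate)

# Hypothetical example data: a character column with mixed date formats
df <- data.frame(date = c("October 05, 2018", "2003-10-19"),
                 stringsAsFactors = FALSE)

# parse_date_time() returns POSIXct (UTC by default); as.Date() converts it to Date
df$date <- as.Date(parse_date_time(df$date, orders = c("mdy", "ymd")))

class(df$date)
```

Any value that matches none of the orders becomes NA with a warning, so sum(is.na(df$date)) gives a quick check of how many rows still need attention.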
CodePudding user response:
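as.Date has an argument called tryFormats. It is not vectorized (the format is chosen from the first non-NA element and then applied to the whole vector), but it can be applied element by element with, e.g., lapply.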
date <- c("October 05, 2018", "2003-10-19", "02/04/20", "11/09/2002",
          "14.05.2021", "Nov 1, 2022", "March 1, 2004")
lapply(date, as.Date, tryFormats = c("%Y-%m-%d", "%B %d, %Y", "%d/%m/%y",
                                     "%m/%d/%Y", "%d.%m.%Y", "%b %d, %Y"))
[[1]]
[1] "2018-10-05"
[[2]]
[1] "2003-10-19"
[[3]]
[1] "2020-04-02"
[[4]]
[1] "2020-09-11"
[[5]]
[1] "2021-05-14"
[[6]]
[1] "2022-11-01"
[[7]]
[1] "2004-03-01"
Note that element 4, "11/09/2002", comes back as "2020-09-11": "%d/%m/%y" is tried before "%m/%d/%Y", and because as.Date ignores trailing characters, only "20" is read as the year. Strings such as "02/04/20" are also ambiguous between day-first and month-first, so check the format list and its order against your data.
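If a single Date vector is more convenient than a list, the same per-element idea can be sketched in base R as a loop over formats. The helper name parse_mixed_dates is hypothetical, not part of base R. It relies on as.Date being vectorized when an explicit format is given, returning NA for elements that do not match. The first format that matches an element wins, and "%B" depends on the current locale.

```r
# Hypothetical helper: fill in each element with the first format that parses it
parse_mixed_dates <- function(x, formats) {
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out)              # elements not parsed yet
    if (!any(todo)) break
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

dates <- c("October 05, 2018", "2003-10-19")
parsed <- parse_mixed_dates(dates, c("%Y-%m-%d", "%B %d, %Y"))
class(parsed)
```

Elements that match none of the formats stay NA, which makes them easy to find with is.na().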