Is there a way to deal with this date format in R?

Time:11-07

I have a data frame whose date column is of class character. I've tried parsing it with as.Date, but the number of NAs is worrisome. The dates are in the following formats: "2003-10-19" and "October 05, 2018".

date <- c("October 05, 2018", "2003-10-19")

as.Date(date)

This is what I tried, but most of my results came back as NA.

CodePudding user response:

Here is an option:

date <- c("October 05, 2018", "2003-10-19", "10/9/95", "6 Oct.2010")

lubridate::parse_date_time(date, orders = c("mdy", "ymd", "dmy"))
#> [1] "2018-10-05 UTC" "2003-10-19 UTC" "1995-10-09 UTC" "2010-10-06 UTC"
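To get an actual Date column rather than POSIXct, the result can be wrapped in as.Date. A minimal sketch, assuming lubridate is installed; the data frame df and the column name date_parsed are made up for illustration and only the two formats from the question are covered:

```r
library(lubridate)

# Hypothetical data frame standing in for the one in the question.
df <- data.frame(
  date = c("October 05, 2018", "2003-10-19"),
  stringsAsFactors = FALSE
)

# parse_date_time() returns POSIXct (UTC by default);
# as.Date() then drops the time-of-day part.
df$date_parsed <- as.Date(parse_date_time(df$date, orders = c("mdy", "ymd")))

class(df$date_parsed)
print(df$date_parsed)
```

Strings that match none of the orders should come back as NA with a warning, so it is worth checking sum(is.na(df$date_parsed)) afterwards.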

CodePudding user response:

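as.Date has a tryFormats argument. It is not vectorized: the formats are tested against the first non-NA element only, and the first one that works is applied to the whole vector. It can still handle mixed formats if you call as.Date once per element, e.g. with lapply. Formats are tried in order and trailing characters are ignored, so ambiguous strings take the first match: in the output below, "11/09/2002" is matched by "%d/%m/%y" and comes back as 2020-09-11 rather than a 2002 date.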

date <- c("October 05, 2018", "2003-10-19", "02/04/20", "11/09/2002",
"14.05.2021", "Nov 1, 2022", "March 1, 2004")

lapply(date, as.Date, tryFormats=c("%Y-%m-%d", "%B %d, %Y", "%d/%m/%y",
                                   "%m/%d/%Y", "%d.%m.%Y", "%b %d, %Y"))
[[1]]
[1] "2018-10-05"

[[2]]
[1] "2003-10-19"

[[3]]
[1] "2020-04-02"

[[4]]
[1] "2020-09-11"

[[5]]
[1] "2021-05-14"

[[6]]
[1] "2022-11-01"

[[7]]
[1] "2004-03-01"
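If you would rather get a Date vector back than a list, one alternative is to loop over the formats instead of over the elements and fill in whatever is still NA. This is only a sketch: parse_mixed_dates is a made-up helper name, not part of base R, and it assumes an English locale for the month names. It has the same first-match behaviour, so the example sticks to inputs that only one format can match.

```r
# Hypothetical helper: try each format on the elements that are still NA.
parse_mixed_dates <- function(x, formats) {
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out)
    if (!any(todo)) break
    # With an explicit format, as.Date() is vectorized and
    # returns NA for elements that do not match.
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

date <- c("October 05, 2018", "2003-10-19", "14.05.2021",
          "Nov 1, 2022", "not a date")

parsed <- parse_mixed_dates(
  date,
  formats = c("%Y-%m-%d", "%B %d, %Y", "%d.%m.%Y", "%b %d, %Y")
)
print(parsed)
```

Anything that no format matches stays NA, which makes it easy to inspect the leftovers with date[is.na(parsed)].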