I have a small dataset with less than 200 rows.
hiv <- data.frame(id=1:6, ENTERED.DATE=c("Mar.03.2021", "Oct.16.2020", "Oct.19.2020", "Mar.09.2021", "Mar.10.2021", "Nov.16.2020"))
The dates are not chronological but recorded haphazardly. The file is .xlsx . Changing the date column from character type to date type produces NAs
library(readxl)
hiv <- read_xlsx("hiv.xlsx", sheet = "Sheet1", col_names = TRUE, na = "NA")
#> New names:
#> * `` -> `...25`
names(hiv) <- make.names(names(hiv))
date <- as.Date(as.character(hiv$ENTERED.DATE), "%d%m%Y")
head(date)
#> [1] NA NA NA NA NA NA
I have read some of the solutions on stackoverflow but could not resolve the issue.
CodePudding user response:
You'll need to use the right date format for your data: %d%m%Y
is date month year, but your data has the format monthabb.day.year, which would require %h.%d.%Y
:
as.Date(c("Mar.03.2021", "Oct.16.2020", "Oct.19.2020", "Mar.09.2021", "Mar.10.2021", "Nov.16.2020"), "%h.%d.%Y")
Or in your data frame:
date <- as.Date(hiv$ENTERED.DATE, "%h.%d.%Y")
Output:
"2021-03-03" "2020-10-16" "2020-10-19" "2021-03-09" "2021-03-10" "2020-11-16"
CodePudding user response:
In the R Documentation for as.Date
they already describe this problem and provide a solution.
First you have to use the right format for your date: %b.%m.%Y
and then you have to set the C locale as described below:
hiv <- data.frame(id=1:6, ENTERED_DATE=c("Mar.03.2021", "Oct.16.2020", "Oct.19.2020", "Mar.09.2021", "Mar.10.2021", "Nov.16.2020"))
Sys.setlocale("LC_TIME","C")
#> [1] "C"
as.Date(hiv$ENTERED_DATE, "%b.%d.%Y")
#> [1] "2021-03-03" "2020-10-16" "2020-10-19" "2021-03-09" "2021-03-10"
#> [6] "2020-11-16"
Created on 2022-08-24 with reprex v2.0.2