I have the following data, which contains some date values as 5 digit character values. When I try to convert to date, the correct date changes to NA
value.
dt <- data.frame(id=c(1,1,1,1,1,1,2,2,2,2,2),
Registrationdate=c('2019-01-09','2019-01-09','2019-01-09','2019-01-09','2019-01-09',
'2019-01-09',"44105","44105","44105","44105","44105"))
Expected value
id Registrationdate
1 1 2019-01-09
2 1 2019-01-09
3 1 2019-01-09
4 1 2019-01-09
5 1 2019-01-09
6 1 2019-01-09
7 2 2020-10-01
8 2 2020-10-01
9 2 2020-10-01
10 2 2020-10-01
11 2 2020-10-01
I tried using
library(openxlsx)
dt$Registrationdate <- convertToDate(dt$Registrationdate, origin = "1900-01-01")
But I got
1 1 <NA>
2 1 <NA>
3 1 <NA>
4 1 <NA>
5 1 <NA>
6 1 <NA>
7 2 2020-10-01
8 2 2020-10-01
9 2 2020-10-01
10 2 2020-10-01
11 2 2020-10-01
CodePudding user response:
Here's one approach using a mix of dplyr
and base R:
library(dplyr, warn = FALSE)
dt |>
mutate(Registrationdate = if_else(grepl("-", Registrationdate),
as.Date(Registrationdate),
openxlsx::convertToDate(Registrationdate, origin = "1900-01-01")))
#> Warning in openxlsx::convertToDate(Registrationdate, origin = "1900-01-01"): NAs
#> introduced by coercion
#> id Registrationdate
#> 1 1 2019-01-09
#> 2 1 2019-01-09
#> 3 1 2019-01-09
#> 4 1 2019-01-09
#> 5 1 2019-01-09
#> 6 1 2019-01-09
#> 7 2 2020-10-01
#> 8 2 2020-10-01
#> 9 2 2020-10-01
#> 10 2 2020-10-01
#> 11 2 2020-10-01
Created on 2022-10-15 with reprex v2.0.2
CodePudding user response:
df1 <- dt %>%
mutate(`Registrationdate` = coalesce(as.character(janitor::excel_numeric_to_date(
as.numeric(`Registrationdate`))), `Registrationdate`))
id Registrationdate
1 1 2019-01-09
2 1 2019-01-09
3 1 2019-01-09
4 1 2019-01-09
5 1 2019-01-09
6 1 2019-01-09
7 2 2020-10-01
8 2 2020-10-01
9 2 2020-10-01
10 2 2020-10-01
11 2 2020-10-01