I have to combine a lot of analyst spreadsheets from excel and it seems like every week they find a new way to infuriate me. Here's my latest problem:
data <- tibble(date_column = c(44673, 44674, "2022-04-25"))
# A tibble: 3 x 1
date_column
<chr>
1 44673
2 44674
3 2022-04-25
I've tried a variety of ways to test if it could be a number and then converting that to date one way and if not converting it another way, like this:
library(tidyverse)
library(lubridate)
data %>%
mutate(date_column = case_when(
!is.na(as.numeric(date_column)) ~ as.Date(date_column, origin = "1899-12-30"),
is.na(as.numeric(date_column)) ~ parse_date_time(date_column, orders = c("mdY","Ymd T", "Ymd"))))
# which throws a
Error: Problem with `mutate()` column `date_column`.
i `date_column = case_when(...)`.
x character string is not in a standard unambiguous format
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
Problem with `mutate()` column `date_column`.
i `date_column = case_when(...)`.
i NAs introduced by coercion
I feel like this must be a pretty common problem that has been solved before. Help greatly appreciated.
CodePudding user response:
Try janitor convert-to-date
:
library(tidyverse)
library(janitor)
#>
#> Attaching package: 'janitor'
#> The following objects are masked from 'package:stats':
#>
#> chisq.test, fisher.test
tribble(~date,
"44673",
"44674",
"2022-04-25") |>
mutate(clean_date = convert_to_date(date))
#> # A tibble: 3 × 2
#> date clean_date
#> <chr> <date>
#> 1 44673 2022-04-22
#> 2 44674 2022-04-23
#> 3 2022-04-25 2022-04-25
Created on 2022-05-05 by the reprex package (v2.0.1)
CodePudding user response:
A possible solution, based on openxlsx::convertToDate
:
library(tidyverse)
library(openxlsx)
data <- tibble(date_column = c(44673, 44674, "2022-04-25"))
coalesce(suppressWarnings(convertToDate(data$date_column)) %>%
as.character(), data$date_column)
#> [1] "2022-04-22" "2022-04-23" "2022-04-25"