I have a large dataframe with multiple date columns. I'd like to use something like str.contains to identify all of those date columns, change their format, and keep the columns in the original dataframe. Here is a sample of the dataset:
dat <- data.frame(
  SSN = c(204, 401, 101, 666, 777),
  date_today = c("1914-01-01", "2022-03-12", "2021-07-09", "1914-01-01", "2022-04-05"),
  date_adm = c("2020-03-11", "2022-03-12", "NA", "2021-04-07", "2022-04-05")
)
I have tried this code, but it looks like it's very wrong:
Data %>% mutate(select(contains("date")), as.Date, format="%d-%m-%Y")
The end result should pick out the columns whose names contain "date", change their format, and retain those date columns within the original dataframe.
CodePudding user response:
The code uses select inside mutate; it should use across instead. In addition, the format should be "%Y-%m-%d", because the strings in the data are year-month-day.
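library(dplyr)
dat %>%
  # dplyr >= 1.1.0 deprecates passing extra arguments through across(),
  # so the format is supplied inside a lambda instead
  mutate(across(contains("date"), ~ as.Date(.x, format = "%Y-%m-%d")))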
-output
  SSN date_today   date_adm
1 204 1914-01-01 2020-03-11
2 401 2022-03-12 2022-03-12
3 101 2021-07-09       <NA>
4 666 1914-01-01 2021-04-07
5 777 2022-04-05 2022-04-05
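If the goal is also to display the dates as day-month-year (which the "%d-%m-%Y" in the question suggests, though that is an assumption), parsing and display are two separate steps. A minimal sketch: `parsed` and `displayed` are illustrative names, and the second step returns character columns rather than Date columns.

```r
library(dplyr)

dat <- data.frame(
  SSN = c(204, 401, 101, 666, 777),
  date_today = c("1914-01-01", "2022-03-12", "2021-07-09", "1914-01-01", "2022-04-05"),
  date_adm = c("2020-03-11", "2022-03-12", "NA", "2021-04-07", "2022-04-05")
)

# Step 1: parse the character columns into Date objects.
# The literal string "NA" does not match the format, so it becomes a missing value.
parsed <- dat %>%
  mutate(across(contains("date"), ~ as.Date(.x, format = "%Y-%m-%d")))

# Step 2 (optional): render the dates as day-month-year text.
# format() on a Date returns character, so these columns are no longer Dates.
displayed <- parsed %>%
  mutate(across(contains("date"), ~ format(.x, "%d-%m-%Y")))
```

Keeping `parsed` around is usually the safer choice for sorting or date arithmetic; `displayed` is only meant for presentation.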
CodePudding user response:
Use across instead of select. Also, you have to pass the right format: the year comes first in your data. I wasn't sure about the order of the month and day in your data, though, so you may have to change that.
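library(dplyr)
dat %>%
  # lambda form, since dplyr >= 1.1.0 deprecates extra arguments in across();
  # "%Y-%d-%m" reads the strings as year-day-month
  mutate(across(contains("date"), ~ as.Date(.x, format = "%Y-%d-%m")))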
#>   SSN date_today   date_adm
#> 1 204 1914-01-01 2020-11-03
#> 2 401 2022-12-03 2022-12-03
#> 3 101 2021-09-07       <NA>
#> 4 666 1914-01-01 2021-07-04
#> 5 777 2022-05-04 2022-05-04
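One way to settle the month/day order might be to parse with both candidate formats and count how many values each one fails on. Every value in the sample has both fields at 12 or below, so the sample alone cannot decide it. The sketch below therefore adds a hypothetical value, "2022-03-25", that is not in the original data.

```r
# "2022-03-25" is a made-up value: its last field (25) cannot be a month
x <- c("2022-03-12", "2022-03-25")

ymd <- as.Date(x, format = "%Y-%m-%d")  # year-month-day
ydm <- as.Date(x, format = "%Y-%d-%m")  # year-day-month

# Values that do not fit a format come back as NA,
# so the format with fewer failures is the likelier one
failures <- c(ymd = sum(is.na(ymd)), ydm = sum(is.na(ydm)))
print(failures)
```

On the full dataframe, the same comparison could be run on each date column; a format that produces many new missing values is probably the wrong one.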