Change date format of multiple date columns

Time:11-03

I have a large dataframe with multiple date columns. I would like to use something like str.contains to identify all of those date columns, change their format, and keep the columns in the original dataframe. Here is a sample of the dataset:

dat <- data.frame(
  SSN = c(204, 401, 101, 666, 777),
  date_today = c("1914-01-01", "2022-03-12", "2021-07-09", "1914-01-01", "2022-04-05"),
  date_adm = c("2020-03-11", "2022-03-12", "NA", "2021-04-07", "2022-04-05")
)

I have tried this code, but it looks like it is wrong:

Data %>% mutate(select(contains("date")), as.Date, format="%d-%m-%Y")

The end result should select the columns whose names contain "date", then change their format while keeping those date columns in the original dataframe.

CodePudding user response:

The code uses select inside mutate; it should use across instead. In addition, the format argument describes how the input strings are laid out, so it should be "%Y-%m-%d" to match the data.

library(dplyr)
dat %>%
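   # a lambda is used because passing extra arguments through across() is deprecated since dplyr 1.1.0
   mutate(across(contains("date"), ~ as.Date(.x, format = "%Y-%m-%d")))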

-output

  SSN date_today   date_adm
1 204 1914-01-01 2020-03-11
2 401 2022-03-12 2022-03-12
3 101 2021-07-09       <NA>
4 666 1914-01-01 2021-04-07
5 777 2022-04-05 2022-04-05
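
If the goal is to display the dates as day-month-year (which the "%d-%m-%Y" in the question suggests, though that is an assumption), the parsed dates can be passed through format(). This is only a sketch: the name res is made up for illustration, and the resulting columns are character strings rather than Date objects.

```r
library(dplyr)

dat <- data.frame(
  SSN = c(204, 401, 101, 666, 777),
  date_today = c("1914-01-01", "2022-03-12", "2021-07-09", "1914-01-01", "2022-04-05"),
  date_adm = c("2020-03-11", "2022-03-12", "NA", "2021-04-07", "2022-04-05")
)

# Parse each column as year-month-day, then re-print it as day-month-year.
# The string "NA" fails to parse and becomes a missing value.
res <- dat %>%
  mutate(across(
    contains("date"),
    ~ format(as.Date(.x, format = "%Y-%m-%d"), "%d-%m-%Y")
  ))

res
```

Keeping the columns as Date (as in the answer above) is usually more convenient for sorting and date arithmetic; formatting to text is best left as the last step before printing or exporting.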

CodePudding user response:

Use across instead of select. You also have to pass the right format: the year comes first in your data. I wasn't sure about the order of the month and day in your data, so you may have to change that.

library(dplyr)

dat %>% 
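  # lambda form, since passing extra arguments through across() is deprecated since dplyr 1.1.0
  mutate(across(contains("date"), ~ as.Date(.x, format = "%Y-%d-%m")))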
#>   SSN date_today   date_adm
#> 1 204 1914-01-01 2020-11-03
#> 2 401 2022-12-03 2022-12-03
#> 3 101 2021-09-07       <NA>
#> 4 666 1914-01-01 2021-07-04
#> 5 777 2022-05-04 2022-05-04
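
One rough way to settle the month/day order is to parse with both formats and count how many values fail. The values below are hypothetical, not taken from the question; in the sample data every month and day is 12 or less, so both formats parse there and this check would not distinguish them.

```r
# Hypothetical values; "2022-03-25" can only be read as year-month-day
x <- c("2020-03-11", "2022-03-25", "NA")

ymd <- as.Date(x, format = "%Y-%m-%d")
ydm <- as.Date(x, format = "%Y-%d-%m")

# Count the values that fail to parse under each candidate format
n_fail_ymd <- sum(is.na(ymd))
n_fail_ydm <- sum(is.na(ydm))

c(ymd = n_fail_ymd, ydm = n_fail_ydm)
```

The format that leaves fewer missing values is the likelier match for the real data.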