R transform date to y-m-d in mismatched date column


I am working with this dataframe of id and date variables.

structure(list(id = c("1000", "1000", "1000", "1000", "1000", 
"1000", "1000", "1000", "1000", "1000", "1000", "1000", "1000", 
"1000", "1000", "1000", "1000", "1000"), Date = c("2022-01-18", 
"2022-01-18", "2022-01-18", "1/20/2022", "1/20/2022", "2022-02-25", 
"2022-03-04", "2022-03-12", "2022-03-15", "2022-03-21", "2022-03-21", 
"2022-03-21", "2022-03-21", "2022-03-28", "3/30/2022", "3/30/2022", 
"3/30/2022", "2022-04-07")), row.names = c(NA, -18L), class = c("tidytable", 
"data.table", "data.frame"))

One thing that is weird about this dataset is the date column. Sometimes it's in the m/d/y format and other times it's in the y-m-d format. I prefer the second format, which is the one R uses.

Because the column mixes formats, I have trouble sorting the data frame: the two date formats end up separated from each other. Is there an operation I can use to make the date format y-m-d all the way through? Something (pseudo-code) like ifelse(date = m/d/y, (transform it to) y-m-d, (otherwise leave it as) y-m-d)?

CodePudding user response:

An alternative lubridate approach you could use:


library(dplyr)

df %>%
  mutate(Date = lubridate::parse_date_time(Date, orders = c("ymd", "mdy")))
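One thing to be aware of: parse_date_time() returns a POSIXct date-time rather than a Date. If a plain Date is wanted, it can be wrapped in as.Date(). Here is a minimal sketch, assuming lubridate is installed; the vector `x` is a hypothetical stand-in for the Date column, not the full data from the question.

```r
library(lubridate)

# Hypothetical sample standing in for the question's Date column
x <- c("2022-01-18", "1/20/2022", "3/30/2022", "2022-03-04")

# Try the ymd order first, then mdy; the result is POSIXct (UTC by default)
parsed <- parse_date_time(x, orders = c("ymd", "mdy"))

# Drop the time component to get a plain Date vector
dates <- as.Date(parsed)

# Once everything is a Date, sorting is chronological
# regardless of how each value was originally written
sorted_dates <- sort(dates)
print(sorted_dates)
```

After the conversion, the usual sorting tools (sort(), order(), or dplyr::arrange() on the column) should order the rows by actual date instead of by the text of the string.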

CodePudding user response:

You can use dplyr::case_when and stringr::str_detect to look for "/" and apply lubridate::mdy if it is found, otherwise apply lubridate::ymd.

This will convert dates from character format to Date format.


library(dplyr)
library(stringr)
library(lubridate)
data <- data %>% 
  mutate(newDate = case_when(str_detect(Date, "/") ~ mdy(Date), 
                             TRUE ~ ymd(Date)))

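Note: this code generates warnings because case_when evaluates both mdy(Date) and ymd(Date) on every value of Date, so each function fails to parse the rows written in the other format. The warnings can be ignored here: case_when only keeps the value from the matching branch, so the end result is correct.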


     id       Date    newDate
1  1000 2022-01-18 2022-01-18
2  1000 2022-01-18 2022-01-18
3  1000 2022-01-18 2022-01-18
4  1000  1/20/2022 2022-01-20
5  1000  1/20/2022 2022-01-20
6  1000 2022-02-25 2022-02-25
7  1000 2022-03-04 2022-03-04
8  1000 2022-03-12 2022-03-12
9  1000 2022-03-15 2022-03-15
10 1000 2022-03-21 2022-03-21
11 1000 2022-03-21 2022-03-21
12 1000 2022-03-21 2022-03-21
13 1000 2022-03-21 2022-03-21
14 1000 2022-03-28 2022-03-28
15 1000  3/30/2022 2022-03-30
16 1000  3/30/2022 2022-03-30
17 1000  3/30/2022 2022-03-30
18 1000 2022-04-07 2022-04-07
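If the warnings are a nuisance, one way to avoid them is to parse each format only on the rows that use it. The following is a minimal base R sketch of that idea, not the method from the answer above; the vector `x` and the names `is_mdy` and `out` are hypothetical, and it assumes every value is either m/d/Y with slashes or Y-m-d with hyphens.

```r
# Hypothetical sample standing in for the question's Date column
x <- c("2022-01-18", "1/20/2022", "2022-03-04", "3/30/2022")

# Flag the m/d/y entries by their slash separator
is_mdy <- grepl("/", x, fixed = TRUE)

# Pre-allocate an all-NA Date vector, then fill each subset
# using only the format that applies to it
out <- rep(as.Date(NA), length(x))
out[is_mdy]  <- as.Date(x[is_mdy],  format = "%m/%d/%Y")
out[!is_mdy] <- as.Date(x[!is_mdy], format = "%Y-%m-%d")

print(out)
```

Because each as.Date() call only ever sees strings in its own format, nothing fails to parse and no warnings are raised. Any value that matches neither pattern would come back as NA, so checking anyNA(out) afterwards is a cheap sanity test.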