I am working with this data frame of id and date variables:
structure(list(id = c("1000", "1000", "1000", "1000", "1000",
"1000", "1000", "1000", "1000", "1000", "1000", "1000", "1000",
"1000", "1000", "1000", "1000", "1000"), Date = c("2022-01-18",
"2022-01-18", "2022-01-18", "1/20/2022", "1/20/2022", "2022-02-25",
"2022-03-04", "2022-03-12", "2022-03-15", "2022-03-21", "2022-03-21",
"2022-03-21", "2022-03-21", "2022-03-28", "3/30/2022", "3/30/2022",
"3/30/2022", "2022-04-07")), row.names = c(NA, -18L), class = c("tidytable",
"data.table", "data.frame"))
One thing that is weird about this dataset is the date column. Sometimes it's in m/d/y format and other times it's in y-m-d format. I prefer the second format, which is the one R uses.
Because the column mixes the two formats, I have trouble sorting the data frame: the rows in each format end up separated from each other. Is there an operation I can use to make the date format y-m-d all the way through? Something (pseudo-code) like ifelse(date = m/d/y, (transform it to) y-m-d, (otherwise leave it the same as) y-m-d)?
CodePudding user response:
An alternative lubridate approach you could use:
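library(dplyr)
# Accept both the "ymd" and "mdy" orders in the same column.
# parse_date_time() returns POSIXct; wrap it in as.Date() for a plain Date.
df %>%
  mutate(Date = lubridate::parse_date_time(Date, orders = c("ymd", "mdy")))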
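To see what this does outside of a data frame, here is a minimal sketch on a short character vector. The `dates` vector is a hypothetical sample using values from the question, and the final `as.Date()` step is an addition for readers who want a Date rather than a POSIXct result.

```r
library(lubridate)

# Hypothetical sample mixing both formats seen in the question
dates <- c("2022-01-18", "1/20/2022", "3/30/2022", "2022-04-07")

# Allow both the "ymd" and "mdy" orders; the result is POSIXct in UTC
parsed <- parse_date_time(dates, orders = c("ymd", "mdy"))

# Drop the time component to get a plain Date vector
result <- as.Date(parsed)
result
```

Once the column is a Date (or POSIXct), sorting with `arrange()` or `order()` follows calendar order instead of string order.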
CodePudding user response:
You can use dplyr::case_when and stringr::str_detect to look for "/" and apply lubridate::mdy if it is found, otherwise apply lubridate::ymd.
This will convert dates from character format to Date format.
library(dplyr)
library(stringr)
library(lubridate)
# Values containing "/" are treated as m/d/y; everything else as y-m-d
data <- data %>%
  mutate(newDate = case_when(str_detect(Date, "/") ~ mdy(Date),
                             TRUE ~ ymd(Date)))
Note: this code generates warnings because case_when evaluates both mdy and ymd on every value of Date, and each function fails to parse the values written in the other format. This doesn't really matter: the end result is correct.
Result:
id Date newDate
1 1000 2022-01-18 2022-01-18
2 1000 2022-01-18 2022-01-18
3 1000 2022-01-18 2022-01-18
4 1000 1/20/2022 2022-01-20
5 1000 1/20/2022 2022-01-20
6 1000 2022-02-25 2022-02-25
7 1000 2022-03-04 2022-03-04
8 1000 2022-03-12 2022-03-12
9 1000 2022-03-15 2022-03-15
10 1000 2022-03-21 2022-03-21
11 1000 2022-03-21 2022-03-21
12 1000 2022-03-21 2022-03-21
13 1000 2022-03-21 2022-03-21
14 1000 2022-03-28 2022-03-28
15 1000 3/30/2022 2022-03-30
16 1000 3/30/2022 2022-03-30
17 1000 3/30/2022 2022-03-30
18 1000 2022-04-07 2022-04-07
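If the warnings from the case_when approach are a concern, one possible alternative is to parse each subset of values with its own format, so that no parser ever sees a string in the wrong format. This is a base R sketch rather than either answer's method; the `dates`, `is_mdy`, and `new_date` names are hypothetical, and it assumes every value containing "/" is m/d/y and every other value is y-m-d.

```r
# Hypothetical sample using values from the question's Date column
dates <- c("2022-01-18", "1/20/2022", "2022-03-21", "3/30/2022")

# Flag the m/d/y entries by the presence of a slash
is_mdy <- grepl("/", dates, fixed = TRUE)

# Start from an all-NA Date vector, then fill each subset using its own format
new_date <- rep(as.Date(NA), length(dates))
new_date[is_mdy]  <- as.Date(dates[is_mdy],  format = "%m/%d/%Y")
new_date[!is_mdy] <- as.Date(dates[!is_mdy], format = "%Y-%m-%d")

new_date
```

Filling by index instead of using ifelse() is deliberate: base ifelse() drops the Date class and returns plain numbers.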