I have three date columns (class Date) and I want to create a new column that holds the earliest of the three dates. This is the code I used:
df1 <- df %>% mutate(timeout= pmin(date1, date2, end_date))
In the case that date1 and date2 are NAs, then I would like the date in end_date to be returned in the timeout column, and therefore timeout should not have any NAs. The code above is bringing back NAs. Any assistance will be greatly appreciated.
Answer:
You can add na.rm = TRUE to pmin, so that it ignores the NAs in each row when calculating the minimum. By default (na.rm = FALSE), pmin returns NA for any row that contains at least one NA.
library(dplyr)
df %>%
  mutate(timeout = pmin(date1, date2, end_date, na.rm = TRUE))
Output
  id      date1      date2   end_date    timeout
1  1       <NA>       <NA> 2008-01-23 2008-01-23
2  1 2007-10-16 2007-11-01 2008-01-23 2007-10-16
3  2 2007-11-30 2007-11-30 2007-11-30 2007-11-30
4  3 2007-08-17 2007-12-17 2008-12-12 2007-08-17
5  3 2008-11-12 2008-12-12 2008-12-12 2008-11-12
Data
df <- structure(list(id = c(1L, 1L, 2L, 3L, 3L), date1 = structure(c(NA,
13802, 13847, 13742, 14195), class = "Date"), date2 = structure(c(NA,
13818, 13847, 13864, 14225), class = "Date"), end_date = c("2008-01-23",
"2008-01-23", "2007-11-30", "2008-12-12", "2008-12-12")), class = "data.frame", row.names = c("1",
"2", "3", "4", "5"))
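To see what na.rm changes, here is a minimal base R sketch. The vectors a, b and end are hypothetical toy values, not the data above. It suggests that a row stays NA with na.rm = TRUE only when every input in that row is NA, so timeout should be free of NAs as long as end_date itself has none.

```r
# Hypothetical toy vectors for illustration only (not the data from the question)
a   <- as.Date(c(NA, "2007-10-16", NA))
b   <- as.Date(c(NA, "2007-11-01", NA))
end <- as.Date(c("2008-01-23", "2008-01-23", NA))

# Default: any NA in a row makes the result for that row NA
res_default <- pmin(a, b, end)

# na.rm = TRUE: NAs are skipped; a row is NA only if all inputs are NA
res_narm <- pmin(a, b, end, na.rm = TRUE)

print(res_default)
print(res_narm)
```

If a row could have all three dates missing, a check such as anyNA(res_narm) is a quick way to find out.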