I want to create a binary variable (0 or 1) based on whether one date variable is later than another. Here is some example data:
diagnosis_date1 diagnosis_date2
30-01-2018 12-05-2016
14-02-2013 06-07-2020
01-02-2009 21-11-2008
09-06-2012 16-10-2014
The goal is a variable that is 1 if diagnosis_date1 is later than diagnosis_date2 and 0 if it is earlier:
diagnosis_date1 diagnosis_date2 condition
30-01-2018 12-05-2016 1
14-02-2013 06-07-2020 0
01-02-2009 21-11-2008 1
09-06-2012 16-10-2014 0
I have tried the following but it just produces all NAs, even though I am sure it worked in another dataset previously for something similar.
df$condition<- ifelse(df$diagnosis_date1 < df$diagnosis_date2, 1, 0)
Any help much appreciated :)
CodePudding user response:
We need to convert the columns to Date class instead of character/factor class:
df[1:2] <- lapply(df[1:2], as.Date, format = '%d-%m-%Y')
and now the code should work
df$condition <- as.integer(df$diagnosis_date1 >= df$diagnosis_date2)
-output
> df
diagnosis_date1 diagnosis_date2 condition
1 2018-01-30 2016-05-12 1
2 2013-02-14 2020-07-06 0
3 2009-02-01 2008-11-21 1
4 2012-06-09 2014-10-16 0
data
df <- structure(list(diagnosis_date1 = c("30-01-2018", "14-02-2013",
"01-02-2009", "09-06-2012"), diagnosis_date2 = c("12-05-2016",
"06-07-2020", "21-11-2008", "16-10-2014")),
class = "data.frame", row.names = c(NA,
-4L))
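As a rough illustration of why the original attempt returned only NAs: the question does not show the column types, so this sketch assumes the dates were stored as factors (the default for strings in read.csv() before R 4.0.0). The vectors d1 and d2 are made-up stand-ins for the first two rows of the example data.

```r
# Assumption: the date columns are factors, not Dates.
d1 <- factor(c("30-01-2018", "14-02-2013"))
d2 <- factor(c("12-05-2016", "06-07-2020"))

# Ordering operators are not meaningful for unordered factors:
# R warns and returns NA for every element.
as_factor <- suppressWarnings(d1 > d2)   # NA NA

# Character vectors compare alphabetically, not chronologically,
# so this runs silently but gets the second row wrong.
as_text <- as.character(d1) > as.character(d2)   # TRUE TRUE

# After conversion to Date the comparison is chronological.
as_date <- as.Date(d1, format = "%d-%m-%Y") >
  as.Date(d2, format = "%d-%m-%Y")               # TRUE FALSE
```

If the columns turn out to be character rather than factor, the comparison would not produce NAs, but it could still be wrong for the reason shown in as_text, so converting to Date is needed either way.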
CodePudding user response:
You could check whether the differences are greater than zero, using `-` in a do.call.
as.integer(do.call(`-`, lapply(dat, as.Date, '%d-%m-%Y')) > 0)
# [1] 1 0 1 0