create variable if one date larger than another date column

Time:03-21

I want to create a binary variable of 0 or 1, based on whether one date variable is larger than another. Here is some example data:

diagnosis_date1 diagnosis_date2
30-01-2018      12-05-2016
14-02-2013      06-07-2020
01-02-2009      21-11-2008
09-06-2012      16-10-2014

The goal is a variable that is 1 if diagnosis_date1 is later than diagnosis_date2 and 0 if it is earlier:

diagnosis_date1 diagnosis_date2 condition
30-01-2018      12-05-2016      1
14-02-2013      06-07-2020      0
01-02-2009      21-11-2008      1
09-06-2012      16-10-2014      0

I have tried the following but it just produces all NAs, even though I am sure it worked in another dataset previously for something similar.

df$condition<- ifelse(df$diagnosis_date1 < df$diagnosis_date2, 1, 0)

Any help much appreciated :)

CodePudding user response:

We need to convert the columns to Date class instead of the character/factor class

df[1:2] <- lapply(df[1:2], as.Date, format = '%d-%m-%Y')

and now the code should work

df$condition <- +(df$diagnosis_date1 >= df$diagnosis_date2)

-output

> df
  diagnosis_date1 diagnosis_date2 condition
1      2018-01-30      2016-05-12         1
2      2013-02-14      2020-07-06         0
3      2009-02-01      2008-11-21         1
4      2012-06-09      2014-10-16         0
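
As a rough illustration of why the original attempt returned only NAs, here is a minimal sketch. It assumes the columns were read in as factors, which the question does not confirm; the names `d1`, `d2`, `factor_cmp`, `char_cmp` and `date_cmp` are made up for this example.

```r
# Assumption: the date columns were read in as factors rather than Dates.
d1 <- factor(c("30-01-2018", "14-02-2013"))
d2 <- factor(c("12-05-2016", "06-07-2020"))

# '<' is not meaningful for unordered factors: R warns and returns NA.
factor_cmp <- suppressWarnings(d1 < d2)

# Character vectors compare as text, so the day-of-month digits decide first.
char_cmp <- as.character(d1) > as.character(d2)

# Date vectors compare chronologically.
date_cmp <- as.Date(d1, format = "%d-%m-%Y") > as.Date(d2, format = "%d-%m-%Y")
```

Here `factor_cmp` is all NA, `char_cmp` is TRUE for both rows (wrong for the second row), and only `date_cmp` gives the intended TRUE, FALSE.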

data

df <- structure(list(diagnosis_date1 = c("30-01-2018", "14-02-2013", 
"01-02-2009", "09-06-2012"), diagnosis_date2 = c("12-05-2016", 
"06-07-2020", "21-11-2008", "16-10-2014")), 
class = "data.frame", row.names = c(NA, 
-4L))
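
One thing that may be worth guarding against: when a value does not match the `format` string, `as.Date` returns NA instead of raising an error. The sketch below rebuilds the same example data, checks that every value parsed, and only then creates the 0/1 column. The intermediate name `parsed` is introduced here for illustration, and the strict `>` is an assumption based on "larger than" in the question.

```r
# Same example data as above, rebuilt so the sketch runs on its own.
df <- data.frame(
  diagnosis_date1 = c("30-01-2018", "14-02-2013", "01-02-2009", "09-06-2012"),
  diagnosis_date2 = c("12-05-2016", "06-07-2020", "21-11-2008", "16-10-2014")
)

# Parse both columns; values that do not match the format become NA.
parsed <- lapply(df[1:2], as.Date, format = "%d-%m-%Y")

# Stop early if anything failed to parse instead of carrying NAs forward.
stopifnot(!anyNA(parsed$diagnosis_date1), !anyNA(parsed$diagnosis_date2))

df[1:2] <- parsed

# as.integer() turns TRUE/FALSE into 1/0.
df$condition <- as.integer(df$diagnosis_date1 > df$diagnosis_date2)
```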

CodePudding user response:

You could check whether the differences are greater than zero, using `-` in a `do.call` (here `dat` holds only the two date columns).

+(do.call(`-`, lapply(dat, as.Date, '%d-%m-%Y')) > 0)
# [1] 1 0 1 0
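
If the one-liner is hard to read, the same idea can be written out step by step. This is only a sketch: `dat` is a hypothetical two-column data frame built from the example data, and `d1`, `d2`, `gap` and `condition` are names chosen for illustration.

```r
# Hypothetical two-column data frame standing in for `dat`.
dat <- data.frame(
  diagnosis_date1 = c("30-01-2018", "14-02-2013", "01-02-2009", "09-06-2012"),
  diagnosis_date2 = c("12-05-2016", "06-07-2020", "21-11-2008", "16-10-2014")
)

d1 <- as.Date(dat$diagnosis_date1, format = "%d-%m-%Y")
d2 <- as.Date(dat$diagnosis_date2, format = "%d-%m-%Y")

# Subtracting two Date vectors gives a difftime (in days).
gap <- d1 - d2

# A positive gap means diagnosis_date1 is the later date;
# as.integer() maps TRUE/FALSE to 1/0.
condition <- as.integer(gap > 0)
```

The `do.call` form does the same subtraction, passing the two converted columns as the two arguments of `-`.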