I want to calculate exposure time (days) from INDX to DS,RE,SE whichever comes first. If all (DS,RE,SE) is NA then time to a fixed date (2015-01-01), should be calculated.
Data:
DF<-tibble::tribble(
~ID, ~INDX, ~DS, ~RE, ~SE,
1L, "2001-01-01", "2002-02-02", "2003-03-03", NA,
2L, "2002-02-02", NA, "2001-01-01", "2002-02-02",
3L, "2003-03-03", "2009-09-09", NA, "2010-10-10",
4L, "2001-01-01", NA, NA, NA
)
DF%>%mutate_at(vars(2,3,4,5), as.Date)
# A tibble: 4 × 5
ID INDX DS RE SE
<int> <date> <date> <date> <date>
1 1 2001-01-01 2002-02-02 2003-03-03 NA
2 2 2002-02-02 NA 2001-01-01 2002-02-02
3 3 2003-03-03 2009-09-09 NA 2010-10-10
4 4 2001-01-01 NA NA NA
>
Desired output:
# A tibble: 4 × 6
ID INDX DS RE SE TIME
<int> <date> <date> <date> <date> <int>
1 1 2001-01-01 2002-02-02 2003-03-03 NA
2 2 2002-02-02 NA 2001-01-01 2002-02-02
3 3 2003-03-03 2009-09-09 NA 2010-10-10
4 4 2001-01-01 NA NA NA
Which is the easiest way?
Regards, H
CodePudding user response:
newDF <- DF%>%mutate_at(vars(2,3,4,5), as.Date)
newDF %>%
mutate(time2use=pmin(DS, RE, SE, na.rm=T)) %>%
mutate(TIME = abs(INDX-time2use)) %>%
mutate(TIME=ifelse(is.na(TIME), abs(INDX-as.Date("2015-01-01")), TIME)) %>%
select(c(-6))
ID INDX DS RE SE TIME
<int> <date> <date> <date> <date> <dbl>
1 1 2001-01-01 2002-02-02 2003-03-03 NA 397
2 2 2002-02-02 NA 2001-01-01 2002-02-02 397
3 3 2003-03-03 2009-09-09 NA 2010-10-10 2382
4 4 2001-01-01 NA NA NA 5113
CodePudding user response:
An option to do it is by means of ifelse conditions like this:
DF$TIME <- ifelse(!is.na(DF$DS), as.Date(DF$DS) - as.Date(DF$INDX),
ifelse(!is.na(DF$RE), as.Date(DF$RE) - as.Date(DF$INDX),
ifelse(!is.na(DF$SE), as.Date(DF$SE) - as.Date(DF$INDX), as.Date("2015-01-01") - as.Date(DF$INDX))))
Output:
DF
# A tibble: 4 x 6
ID INDX DS RE SE TIME
<int> <chr> <chr> <chr> <chr> <dbl>
1 1 2001-01-01 2002-02-02 2003-03-03 NA 397
2 2 2002-02-02 NA 2001-01-01 2002-02-02 -397
3 3 2003-03-03 2009-09-09 NA 2010-10-10 2382
4 4 2001-01-01 NA NA NA 5113