I would like to create a new column test
that equals to 1, if "any" of the dates in the visit
column (per ID) falls between the [a,b) interval.
ID birth gender visit a b
1 104 2011-05-28 M 2021-05-04 2016-05-28 2018-05-28
2 104 2011-05-28 M 2018-05-14 2016-05-28 2018-05-28
3 104 2011-05-28 M 2021-06-14 2016-05-28 2018-05-28
4 104 2011-05-28 M 2019-06-17 2016-05-28 2018-05-28
5 104 2011-05-28 M 2017-05-18 2016-05-28 2018-05-28
6 104 2011-05-28 M 2020-11-19 2016-05-28 2018-05-28
7 104 2011-05-28 M 2013-10-22 2016-05-28 2018-05-28
8 104 2011-05-28 M 2021-11-24 2016-05-28 2018-05-28
9 105 2013-04-14 W 2018-06-19 2018-04-14 2020-04-14
10 105 2013-04-14 W 2019-05-27 2018-04-14 2020-04-14
11 106 2010-10-08 M 2017-10-02 2015-10-08 2017-10-08
12 106 2010-10-08 M 2015-05-05 2015-10-08 2017-10-08
13 106 2010-10-08 M 2020-11-05 2015-10-08 2017-10-08
14 106 2010-10-08 M 2021-07-06 2015-10-08 2017-10-08
15 106 2010-10-08 M 2015-01-08 2015-10-08 2017-10-08
16 106 2010-10-08 M 2018-11-08 2015-10-08 2017-10-08
17 106 2010-10-08 M 2021-06-09 2015-10-08 2017-10-08
18 106 2010-10-08 M 2020-02-11 2015-10-08 2017-10-08
19 106 2010-10-08 M 2018-06-11 2015-10-08 2017-10-08
20 106 2010-10-08 M 2019-09-11 2015-10-08 2017-10-08
21 106 2010-10-08 M 2017-04-12 2015-10-08 2017-10-08
22 106 2010-10-08 M 2017-10-12 2015-10-08 2017-10-08
23 106 2010-10-08 M 2018-12-13 2015-10-08 2017-10-08
24 106 2010-10-08 M 2017-03-14 2015-10-08 2017-10-08
25 106 2010-10-08 M 2018-11-16 2015-10-08 2017-10-08
26 106 2010-10-08 M 2013-05-17 2015-10-08 2017-10-08
27 106 2010-10-08 M 2014-06-17 2015-10-08 2017-10-08
28 106 2010-10-08 M 2016-03-18 2015-10-08 2017-10-08
29 106 2010-10-08 M 2017-03-23 2015-10-08 2017-10-08
30 106 2010-10-08 M 2021-11-25 2015-10-08 2017-10-08
31 106 2010-10-08 M 2016-01-26 2015-10-08 2017-10-08
32 106 2010-10-08 M 2014-05-27 2015-10-08 2017-10-08
33 106 2010-10-08 M 2012-11-28 2015-10-08 2017-10-08
34 106 2010-10-08 M 2019-01-30 2015-10-08 2017-10-08
35 106 2010-10-08 M 2017-03-31 2015-10-08 2017-10-08
36 107 2016-06-30 M 2021-11-01 2021-06-30 2023-06-30
37 107 2016-06-30 M 2018-06-08 2021-06-30 2023-06-30
38 107 2016-06-30 M 2022-06-08 2021-06-30 2023-06-30
39 107 2016-06-30 M 2021-10-08 2021-06-30 2023-06-30
40 107 2016-06-30 M 2020-10-09 2021-06-30 2023-06-30
41 107 2016-06-30 M 2020-11-09 2021-06-30 2023-06-30
42 107 2016-06-30 M 2021-11-12 2021-06-30 2023-06-30
43 107 2016-06-30 M 2020-01-13 2021-06-30 2023-06-30
44 107 2016-06-30 M 2022-07-20 2021-06-30 2023-06-30
45 107 2016-06-30 M 2021-10-20 2021-06-30 2023-06-30
46 107 2016-06-30 M 2021-01-22 2021-06-30 2023-06-30
47 107 2016-06-30 M 2021-06-23 2021-06-30 2023-06-30
48 107 2016-06-30 M 2021-10-25 2021-06-30 2023-06-30
49 107 2016-06-30 M 2021-10-29 2021-06-30 2023-06-30
50 107 2016-06-30 M 2018-11-30 2021-06-30 2023-06-30
I tried:
df <- df%>%
group_by(ID)%>%
mutate(test = ifelse(df$visit %within% as.interval(df$a,df$b),1,0))
which is obviously wrong.
CodePudding user response:
For sake of readability, take the extra effort and define your interval manually. This is easy using ave
.
transform(dat, test=as.numeric(ave(visit >= a & visit < b, ID, FUN=any)))
# ID visit a b test
# 1 104 3021-05-04 2016-05-28 2018-05-28 1
# 2 104 2018-05-14 2016-05-28 2018-05-28 1
# 3 104 2021-06-14 2016-05-28 2018-05-28 1
# 4 105 3019-06-17 2016-05-28 2018-05-28 0
# 5 105 3017-05-18 2016-05-28 2018-05-28 0
# 6 105 3020-11-19 2016-05-28 2018-05-28 0
Data:
dat <- structure(list(ID = c(104L, 104L, 104L, 105L, 105L, 105L), visit = c("3021-05-04",
"2018-05-14", "2021-06-14", "3019-06-17", "3017-05-18", "3020-11-19"
), a = c("2016-05-28", "2016-05-28", "2016-05-28", "2016-05-28",
"2016-05-28", "2016-05-28"), b = c("2018-05-28", "2018-05-28",
"2018-05-28", "2018-05-28", "2018-05-28", "2018-05-28")), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
CodePudding user response:
You can use any
>=
<
.
coerces the logical vector to a numeric one with TRUE = 1
and FALSE = 0
.
df %>%
group_by(ID) %>%
mutate(test = any(visit >= a & visit < b))
Also, note that you should always provide a minimal and reproducible example. Here, you don't provide the code to reproduce your data set nor is it minimal. You could have chosen two rows per group and 2 groups for instance. Check here for more.