I have a dataset like this:
x time
1 TRUE 10
2 FALSE 20
3 TRUE 11
4 FALSE 10
5 TRUE 16
6 FALSE 2
7 TRUE 17
8 FALSE 6
9 TRUE 11
10 FALSE 7
11 TRUE 20
12 FALSE 3
13 TRUE 10
14 FALSE 4
15 TRUE 2
16 FALSE 10
17 TRUE 3
18 FALSE 6
Using R, I would like to generate a new variable that marks rows meeting a condition based on x and time. Specifically, I would like to scan the data from the beginning and, whenever x is TRUE and time is greater than 15, find the next row in which x is FALSE and time is greater than 5, and mark that row in a new variable. This should be repeated throughout the entire dataset.
The output I would like to get looks like this:
x time Marker
1 TRUE 10
2 FALSE 20
3 TRUE 11
4 FALSE 10
5 TRUE 16
6 FALSE 2
7 TRUE 17
8 FALSE 6 Meet
9 TRUE 11
10 FALSE 7
11 TRUE 20
12 FALSE 3
13 TRUE 10
14 FALSE 4
15 TRUE 2
16 FALSE 10 Meet
17 TRUE 3
18 FALSE 6
I'm thinking about doing this with a loop in R because I have a very long dataset, but I cannot figure it out. Any advice would be appreciated.
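In case it helps anyone reproduce this, here is a minimal sketch of the sample data as a data frame. The name d is only an assumption for the example, and time is entered as plain numbers. The last two lines list the rows that satisfy each condition on its own.

```r
# Sample data from the question; the name `d` is an assumption.
d <- data.frame(
  x    = rep(c(TRUE, FALSE), times = 9),
  time = c(10, 20, 11, 10, 16, 2, 17, 6, 11, 7, 20, 3, 10, 4, 2, 10, 3, 6)
)

# Rows that start a search: x is TRUE and time > 15.
trigger_rows <- which(d$x & d$time > 15)
trigger_rows
# [1]  5  7 11

# Rows that could be marked: x is FALSE and time > 5.
candidate_rows <- which(!d$x & d$time > 5)
candidate_rows
```

Only the first candidate row after each trigger should be marked, which is why rows 2, 4, 10 and 18 stay blank in the desired output.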
CodePudding user response:
library(dplyr)

# Return "" everywhere except "Meet" at the first TRUE in x (if any).
mark_first <- function(x) {
  out <- rep("", length(x))
  if (!any(x)) return(out)
  out[which.max(x)] <- "Meet"
  return(out)
}

d %>%
  # A new group starts at every row where x is TRUE and time > 15.
  group_by(g = cumsum(x & (time > 15))) %>%
  # g > 0 skips the rows that come before the first such row.
  mutate(Marker = mark_first(g > 0 & !x & time > 5))

# A tibble: 18 × 4
# Groups:   g [4]
   x      time     g Marker
   <lgl> <int> <int> <chr>
 1 TRUE     10     0 ""
 2 FALSE    20     0 ""
 3 TRUE     11     0 ""
 4 FALSE    10     0 ""
 5 TRUE     16     1 ""
 6 FALSE     2     1 ""
 7 TRUE     17     2 ""
 8 FALSE     6     2 "Meet"
 9 TRUE     11     2 ""
10 FALSE     7     2 ""
11 TRUE     20     3 ""
12 FALSE     3     3 ""
13 TRUE     10     3 ""
14 FALSE     4     3 ""
15 TRUE      2     3 ""
16 FALSE    10     3 "Meet"
17 TRUE      3     3 ""
18 FALSE     6     3 ""
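To see why grouping by cumsum() works here, it may help to look at the group ids on their own. This is a minimal sketch, assuming the question's data is retyped by hand as two plain vectors (x and time are just names picked for the example).

```r
# The question's columns as plain vectors (retyped by hand for this example).
x    <- rep(c(TRUE, FALSE), times = 9)
time <- c(10, 20, 11, 10, 16, 2, 17, 6, 11, 7, 20, 3, 10, 4, 2, 10, 3, 6)

# Every row where x is TRUE and time > 15 adds 1 to the running total,
# so each row carries the number of such rows seen so far.
g <- cumsum(x & time > 15)
g
# [1] 0 0 0 0 1 1 2 2 2 2 3 3 3 3 3 3 3 3

# Rows with g == 0 come before the first such row.
which(g == 0)
# [1] 1 2 3 4
```

Within each group with g > 0, the first row where x is FALSE and time exceeds the threshold is the "next" row the question asks for.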
CodePudding user response:
Assuming your data.frame is called d:
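look <- FALSE                # TRUE while searching for the next matching FALSE row
d$Marker <- NA_character_
for (i in seq_len(nrow(d))) {
  # A TRUE row with time > 15 starts (or restarts) the search.
  if (d$x[i] && d$time[i] > 15) {
    look <- TRUE
    next
  }
  # While searching, mark the first FALSE row with time > 5 and stop looking.
  if (look && !d$x[i] && d$time[i] > 5) {
    d$Marker[i] <- "Meet"
    look <- FALSE
  }
}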
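If the loop needs to be reused, one option is to wrap it in a function and check it against the rows marked in the question. This is only a sketch: the helper name mark_meets and its trigger/target arguments are made up for the example, the data is retyped by hand, and it assumes x and time contain no missing values.

```r
# Hypothetical helper (name and arguments are assumptions for this example):
# the same search as the loop above, written over plain vectors.
mark_meets <- function(x, time, trigger = 15, target = 5) {
  marker <- rep(NA_character_, length(x))
  look <- FALSE
  for (i in seq_along(x)) {
    if (x[i] && time[i] > trigger) {
      look <- TRUE                       # start (or restart) the search
    } else if (look && !x[i] && time[i] > target) {
      marker[i] <- "Meet"                # first matching row after a trigger
      look <- FALSE
    }
  }
  marker
}

# Sample data from the question.
d <- data.frame(
  x    = rep(c(TRUE, FALSE), times = 9),
  time = c(10, 20, 11, 10, 16, 2, 17, 6, 11, 7, 20, 3, 10, 4, 2, 10, 3, 6)
)
d$Marker <- mark_meets(d$x, d$time)

which(d$Marker == "Meet")
# [1]  8 16
```

Rows 8 and 16 are the ones marked in the desired output. A plain loop like this does a single pass over the rows, so it should stay reasonable for long data as long as the result vector is preallocated as it is here.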