I want to filter group id after two consecutive "yes" values observed only.
DT<-data.frame(id=c(1,1,1,2,2,2,2,3,3,3,3,3,3,3),
type=c("yes","yes","no","no","yes","yes","no","no","yes","no","yes","yes","no","yes"))
Desired output:
id type
1 yes
1 yes
1 no
2 yes
2 yes
2 no
3 yes
3 yes
3 no
3 yes
I tried it as:
DT<-DT %>% group_by(id) %>%
mutate(Tlag = lag(type), ident = type == 'yes' & Tlag == 'yes' ) %>%
filter(any(ident)) %>% filter(type == 'yes' | (cumsum(type == 'yes') == 1)) %>% select(-ident)
CodePudding user response:
Using dplyr
and kind of an indicator:
DT %>%
group_by(id) %>%
mutate(indic = ifelse(lead(type) == "yes" & type == "yes", 1, NA)) %>%
fill(indic, .direction = 'down') %>%
filter(!is.na(indic)) %>%
select(-indic)
Output:
# A tibble: 10 x 2
# Groups: id [3]
id type
<dbl> <chr>
1 1 yes
2 1 yes
3 1 no
4 2 yes
5 2 yes
6 2 no
7 3 yes
8 3 yes
9 3 no
10 3 yes
CodePudding user response:
You almost had it, i.e. use lead
instead of lag
and adjust your cumsum threshold a bit:
library(tidyverse)
DT %>%
group_by(id) %>%
mutate(Tlead = lead(type),
cumsum_yes = cumsum(type == "yes" & Tlead == "yes"),
cumsum_yes = if_else(is.na(cumsum_yes), lag(cumsum_yes), cumsum_yes)) %>%
filter(cumsum_yes >=1) %>%
select(-cumsum_yes, -Tlead)
which gives:
# A tibble: 10 x 2
# Groups: id [3]
id type
<dbl> <chr>
1 1 yes
2 1 yes
3 1 no
4 2 yes
5 2 yes
6 2 no
7 3 yes
8 3 yes
9 3 no
10 3 yes
Here's the result without deleting the temporary columns so that you can better see what's going on:
# A tibble: 10 x 4
# Groups: id [3]
id type Tlead cumsum_yes
<dbl> <chr> <chr> <int>
1 1 yes yes 1
2 1 yes no 1
3 1 no NA 1
4 2 yes yes 1
5 2 yes no 1
6 2 no NA 1
7 3 yes yes 1
8 3 yes no 1
9 3 no yes 1
10 3 yes NA 1
CodePudding user response:
Using rollapply
from zoo
you can look for any number of 'yes'
value.
library(dplyr)
library(zoo)
n <- 2
DT %>%
group_by(id) %>%
filter(row_number() >= match(TRUE,
rollapply(type == 'yes', n, all, align = 'left', fill = NA))) %>%
ungroup
# id type
# <dbl> <chr>
# 1 1 yes
# 2 1 yes
# 3 1 no
# 4 2 yes
# 5 2 yes
# 6 2 no
# 7 3 yes
# 8 3 yes
# 9 3 no
#10 3 yes