I have the following data. I need to keep only the groups (by id) that contain at least one 'Yes'
but no consecutive 'Yes' values.
data <- data.frame(id=c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4, 5,5,5,5),
type=c('No','Yes','No',NA,'Yes','No','Yes','Yes',NA,'Yes','No','Yes','No','Yes',
NA,'No','Yes','Yes','No','No','No','No'))
Expected output:
id type
1 1 No
2 1 Yes
3 1 No
4 1 NA
5 1 Yes
6 3 No
7 3 Yes
8 3 No
9 3 Yes
10 3 NA
I tried the following:
library(dplyr)
data1 <- data %>% group_by(id) %>%
  filter(any(type %in% 'Yes', na.rm = TRUE)) %>%
  mutate(tlag = any(type == 'Yes' & lag(type == 'Yes'))) %>%
  filter(!any(tlag == TRUE)) %>% select(-tlag) %>%
  ungroup()
CodePudding user response:
You can use two filtering conditions:
library(dplyr)
data %>%
  group_by(id) %>%
  filter(any(type == "Yes"),   # at least one "Yes" in the group
         # no "Yes" directly preceded by another "Yes"
         !any(type == "Yes" & lag(type, default = "No") == "Yes", na.rm = TRUE))
Output:
id type
<dbl> <chr>
1 1 No
2 1 Yes
3 1 No
4 1 NA
5 1 Yes
6 3 No
7 3 Yes
8 3 No
9 3 Yes
10 3 NA
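To see what the second condition evaluates per group, here is a minimal base-R sketch on single vectors. The helper name has_consecutive_yes is hypothetical (it is not part of dplyr), and the shifted vector prev is only meant to mimic lag(type, default = "No").

```r
# Hypothetical helper: TRUE when a vector contains two adjacent "Yes" values.
has_consecutive_yes <- function(type) {
  # Shift the vector right by one position, padding the front with "No".
  # This mimics dplyr::lag(type, default = "No") without needing dplyr.
  prev <- c("No", head(type, -1))
  # NA comparisons yield NA; na.rm = TRUE makes any() ignore them.
  any(type == "Yes" & prev == "Yes", na.rm = TRUE)
}

has_consecutive_yes(c("No", "Yes", "No", NA, "Yes"))   # id 1: FALSE
has_consecutive_yes(c("No", "Yes", "Yes", NA, "Yes"))  # id 2: TRUE
```

Without na.rm = TRUE, the call for id 1 would return NA instead of FALSE, and filter() drops groups whose condition is NA.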
CodePudding user response:
We could use rle() to check whether any run of 'Yes' is longer than one, applying the check to each group with by().
by(data, data$id, \(x)
   if (all(is.na(x$type)) || all(na.omit(x$type) == 'No') ||
       any(na.omit(with(rle(x$type), lengths[values == "Yes"])) > 1)) NULL
   else x) |>
  do.call(what = rbind)
# id type
# 1.1 1 No
# 1.2 1 Yes
# 1.3 1 No
# 1.4 1 <NA>
# 1.5 1 Yes
# 3.11 3 No
# 3.12 3 Yes
# 3.13 3 No
# 3.14 3 Yes
# 3.15 3 <NA>
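As a rough illustration of what rle() contributes here, the sketch below computes the longest run of "Yes" in a single vector. The helper name longest_yes_run is hypothetical and not part of the answer above; it only assumes base R.

```r
# Hypothetical helper: length of the longest run of "Yes" (0 if there is none).
longest_yes_run <- function(type) {
  runs <- rle(type)                                   # run lengths and their values
  yes <- !is.na(runs$values) & runs$values == "Yes"   # runs made of "Yes", skipping NA
  if (any(yes)) max(runs$lengths[yes]) else 0L
}

longest_yes_run(c("No", "Yes", "No", NA, "Yes"))   # id 1: 1
longest_yes_run(c("No", "Yes", "Yes", NA, "Yes"))  # id 2: 2
longest_yes_run(c("No", "No", "No", "No"))         # id 5: 0
```

Under this formulation a group satisfies both requirements exactly when the result equals 1: at least one "Yes" exists, and no run of "Yes" is longer than one.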