In this post I got some great guidance on how to use all() inside a dplyr pipeline to flag groups that contain both of two values.
Now I'd like to move on to a more complex condition.
Take the following data frame:
df <- data.frame(id = factor(c(1, 2, 2, 3, 3, 4, 4)),
                 l = factor(c("a", "a", "b", "a", "c", "b", "d")))
df
# id l
# 1 1 a
# 2 2 a
# 3 2 b
# 4 3 a
# 5 3 c
# 6 4 b
# 7 4 d
Based on the advice I got in my previous post, I can get TRUE for the id numbers that contain both a AND c using all():
library(dplyr)

df %>%
  group_by(id) %>%
  summarise(ab = all(c("a", "c") %in% l))
# id ab
# <fct> <lgl>
# 1 1 FALSE
# 2 2 FALSE
# 3 3 TRUE
# 4 4 FALSE
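To see why this works, it may help to look at what %in% returns for a single group before all() collapses it. The sketch below uses base R only; the vectors l2 and l3 are hypothetical stand-ins for the l values of id 2 and id 3.

```r
# Hypothetical stand-ins for the values of l within two groups
l2 <- factor(c("a", "b"))  # values for id 2
l3 <- factor(c("a", "c"))  # values for id 3

# %in% tests each wanted value against the group's values,
# giving one logical per wanted value
hit2 <- c("a", "c") %in% l2  # "a" is present, "c" is not
hit3 <- c("a", "c") %in% l3  # both are present

# all() is TRUE only when every wanted value was found
all(hit2)
all(hit3)
```

Because all() needs every element to be TRUE, id 2 fails the a AND c test while id 3 passes.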
Now what do I do if I want to pick up id numbers with EITHER (a AND c) OR (a AND b) (i.e. id numbers 2 and 3 in our example)?
Any help much appreciated
CodePudding user response:
df %>%
  group_by(id) %>%
  summarise(ab = all(c("a", "c") %in% l) | all(c("a", "b") %in% l))
Output:
# A tibble: 4 × 2
id ab
<fct> <lgl>
1 1 FALSE
2 2 TRUE
3 3 TRUE
4 4 FALSE
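If the number of accepted combinations grows, chaining all() calls with | gets unwieldy. One possible generalisation, offered only as a sketch, keeps the combinations in a list and checks whether any of them is fully present in the group. The names combos and has_any_combo() are hypothetical and not part of dplyr.

```r
library(dplyr)

df <- data.frame(id = factor(c(1, 2, 2, 3, 3, 4, 4)),
                 l = factor(c("a", "a", "b", "a", "c", "b", "d")))

# Hypothetical list of accepted combinations; add more vectors as needed
combos <- list(c("a", "c"), c("a", "b"))

# Hypothetical helper: TRUE if at least one combination is
# completely contained in `values`
has_any_combo <- function(values, combos) {
  any(vapply(combos, function(combo) all(combo %in% values), logical(1)))
}

res <- df %>%
  group_by(id) %>%
  summarise(ab = has_any_combo(l, combos))

res
```

For this particular case, where a appears in both combinations, the condition could also be written as "a" %in% l & any(c("b", "c") %in% l), which gives the same result.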