Complex within-group conditionals in dplyr

Time:03-08

In a previous post I got some great guidance on how to use the all() function within dplyr to detect when a group contains both of two values.

Now I'd like to move on to a more complex conditional.

Take the following data frame:

library(dplyr)  # provides %>%, group_by() and summarise()

df <- data.frame(id = factor(c(1, 2, 2, 3, 3, 4, 4)),
                 l = factor(c("a", "a", "b", "a", "c", "b", "d")))

df

#   id l
# 1  1 a
# 2  2 a
# 3  2 b
# 4  3 a
# 5  3 c
# 6  4 b
# 7  4 d

Based on the advice I got in my previous post, I can get TRUE for the id numbers that contain both a AND c by using all():

df %>%
  group_by(id) %>%
  summarise(ab = all(c("a", "c") %in% l))

#   id    ab   
#   <fct> <lgl>
# 1 1     FALSE
# 2 2     FALSE
# 3 3     TRUE 
# 4 4     FALSE

Now what do I do if I want to pick up id numbers with EITHER (a AND c) OR (a AND b) (i.e. id numbers 2 and 3 in our example)?

Any help would be much appreciated.

Answer:

df %>%
  group_by(id) %>%
  summarise(ab = all(c("a", "c") %in% l) | all(c("a", "b") %in% l))

Output:

# A tibble: 4 × 2
  id    ab   
  <fct> <lgl>
1 1     FALSE
2 2     TRUE 
3 3     TRUE 
4 4     FALSE
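
As a hedged variation on the answer above: since (a AND c) OR (a AND b) is logically the same as a AND (b OR c), the condition can be factored, and a list of combinations can be checked in one go. The names `combos`, `res_factored`, `res_general` and `matched` below are only illustrative and do not come from the original thread; this is a sketch, not the only way to do it.

```r
library(dplyr)

df <- data.frame(id = factor(c(1, 2, 2, 3, 3, 4, 4)),
                 l = factor(c("a", "a", "b", "a", "c", "b", "d")))

# Factored form: the group must contain "a" and at least one of "b" or "c"
res_factored <- df %>%
  group_by(id) %>%
  summarise(ab = "a" %in% l && any(c("b", "c") %in% l))

# General form: TRUE if the group contains every value of at least one
# combination in `combos` (a hypothetical helper list)
combos <- list(c("a", "c"), c("a", "b"))
res_general <- df %>%
  group_by(id) %>%
  summarise(ab = any(vapply(combos, function(x) all(x %in% l), logical(1))))

# Keep the rows of the matching groups instead of summarising them
matched <- df %>%
  group_by(id) %>%
  filter("a" %in% l, any(c("b", "c") %in% l)) %>%
  ungroup()
```

The general form is handy when the set of accepted combinations grows, because only `combos` needs to change.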