filter tibble with group_by ALL records in vector (without multiple ANY statements)

Time:12-20

I was just wondering whether it is possible to select (filter) the whole group in a dataframe/tibble based on a vector as input.

This is my example (small) dataset:

# load packages
library(tidyverse)

# make example data
data1 <- tibble(a = "groep", b = c(1, 2, 3))
data2 <- tibble(a = "groep2", b = c(1, 2))
data3 <- tibble(a = "groep3", b = c(1, 2, 3, 4))
data4 <- tibble(a = "groep4", b = c(2, 3, 5))

# combine example data
example <- bind_rows(data1, data2, data3, data4)

Now I would like to extract the groups that contain the numbers 1, 2 and 3, i.e. data1 and data3. I would like to do this with group_by.

I could do this:

example %>% group_by(a) %>% filter(any(b == 1) & any(b == 2) & any(b == 3)) %>% ungroup()

I am satisfied with the output, but I wonder whether there is a less cumbersome method. There are only three any() calls in this block of code, but what if there were many more values to check? Suppose the values are stored in a vector:

vec <- c(1,2,3)

Is it possible to generate the same output using this vector "vec"?

Answer:

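We could use %in% with all(). Within each group, vec %in% b returns one logical value per element of vec, and all() is TRUE only when every element of vec occurs in that group's b.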

library(dplyr)
example %>%
  group_by(a) %>%
  # keep a group only if every element of vec occurs in its b column
  filter(all(vec %in% b)) %>%
  ungroup()

-output

# A tibble: 7 × 2
  a          b
  <chr>  <dbl>
1 groep      1
2 groep      2
3 groep      3
4 groep3     1
5 groep3     2
6 groep3     3
7 groep3     4
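
To make the per-group logic visible, here is a minimal base R sketch of the same check. It is an illustration rather than part of the original answer: the data frame is rebuilt by hand from the question's example, and the names contains_vec, within_vec, keep and result are made up for this sketch. It also evaluates the reversed expression, all(b %in% vec), which asks a different question.

```r
# Base R sketch of the per-group check (no dplyr needed).
# The data frame rebuilds the question's example data by hand.
example <- data.frame(
  a = rep(c("groep", "groep2", "groep3", "groep4"), times = c(3, 2, 4, 3)),
  b = c(1, 2, 3,  1, 2,  1, 2, 3, 4,  2, 3, 5),
  stringsAsFactors = FALSE
)
vec <- c(1, 2, 3)

# One logical per group: does the group's b contain every element of vec?
contains_vec <- tapply(example$b, example$a, function(b) all(vec %in% b))

# Reversed operands ask a different question:
# are all of the group's b values found in vec?
within_vec <- tapply(example$b, example$a, function(b) all(b %in% vec))

# Keep the rows of the groups that contain every element of vec.
keep <- names(contains_vec)[contains_vec]
result <- example[example$a %in% keep, ]
```

Here contains_vec is TRUE for groep and groep3, which matches the output above, while within_vec is TRUE for groep and groep2. The order of the operands around %in% therefore matters: vec %in% b selects groups that contain all of vec, whereas b %in% vec selects groups that contain nothing outside vec.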