How to filter rows based on multiple columns satisfying conditions using tidyverse?

Time:11-14

Data:

> library(tidyverse)
> df <- tibble(id = c(1, 1, 2, 2, 3, 3, 4, 5), condition = c(1, 0, -1, 1, 0, 3, 2, 10), distractor = c(10, 1, 10, 0.1, -5, 70, NA, 0.2))
> df
# A tibble: 8 × 3
     id condition distractor
  <dbl>     <dbl>      <dbl>
1     1         1       10  
2     1         0        1  
3     2        -1       10  
4     2         1        0.1
5     3         0       -5  
6     3         3       70  
7     4         2       NA  
8     5        10        0.2

A data frame that holds the selection criteria for filtering rows:

> selection <- tibble(id = c(1, 2, 3), condition = c(1, -1, 0))
> selection
# A tibble: 3 × 2
     id condition
  <dbl>     <dbl>
1     1         1
2     2        -1
3     3         0

How do I keep only the rows of df whose id and condition both match a row in selection?

Expected output:

     id condition distractor
  <dbl>     <dbl>      <dbl>
1     1         1       10  
2     2        -1       10  
3     3         0       -5  

CodePudding user response:

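You can use semi_join(), which keeps the rows of df that have a match in selection on every column the two tables share (here id and condition):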

library(dplyr)

df %>%
  semi_join(selection)

Joining with `by = join_by(id, condition)`
# A tibble: 3 × 3
     id condition distractor
  <dbl>     <dbl>      <dbl>
1     1         1         10
2     2        -1         10
3     3         0         -5
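
If you would rather not rely on the automatic choice of join columns, you can name them yourself. The following is a minimal sketch under that assumption: it rebuilds df and selection from the question, passes by = c("id", "condition") explicitly (which also suppresses the "Joining with" message), and uses anti_join() to show the rows that were dropped. The names result and rest are only illustrative.

```r
library(dplyr)

# Same data as in the question
df <- tibble(
  id = c(1, 1, 2, 2, 3, 3, 4, 5),
  condition = c(1, 0, -1, 1, 0, 3, 2, 10),
  distractor = c(10, 1, 10, 0.1, -5, 70, NA, 0.2)
)
selection <- tibble(id = c(1, 2, 3), condition = c(1, -1, 0))

# Keep rows of df whose (id, condition) pair appears in selection;
# naming the keys explicitly avoids the "Joining with ..." message
result <- df %>%
  semi_join(selection, by = c("id", "condition"))

# The complement: rows of df with no matching pair in selection
rest <- df %>%
  anti_join(selection, by = c("id", "condition"))
```

Unlike inner_join(), semi_join() never adds columns from selection and never duplicates rows of df, so it behaves like a filter rather than a merge.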