How to filter for rows that match some but not all conditions in R

Time:11-09

I have a data frame with a given number of columns, say 5 for example. I have a condition for each of the columns and want to select the rows which match 4 out of 5 conditions.

For a simple example imagine I wanted the rows where the value for at least 3 of columns A to E is greater than 1.

I know how to filter using tidyverse for specific conditions, i.e. if column A > 1 and column B < 5, but I am not sure how to filter for rows that meet some but not all of the conditions that I set. Perhaps a rather simple question, but I can't find an immediate answer online and am under a bit of time pressure. I am very much a beginner, so if possible keep explanations as simple as possible. Thanks!

CodePudding user response:

As boolean values can be turned into 0 or 1 (numeric), you can add together your 5 conditions and check whether that sum is at least 4 (or whatever number of conditions you need):

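library(tidyverse)

# 5 random columns; as.data.frame() names them V1..V5
df = as_tibble(as.data.frame(replicate(5, sample(1:10))))

df %>%
  # each comparison is TRUE/FALSE, so the sum counts how many conditions hold
  mutate(cond = (V1>5) + (V2>2) + (V3<4) + (V4>7) + (V5<2)) %>%
  filter(cond >= 4)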

# A tibble: 3 x 6
     V1    V2    V3    V4    V5  cond
  <int> <int> <int> <int> <int> <int>
1     9    10     3     8     3     4
2     1     7     1    10     6     3
3    10     8     2     9     2     4

Note: you can do it in one step; I only separated it so you can see the sum column.

df %>% filter((V1>5) + (V2>2) + (V3<4) + (V4>7) + (V5<2) >= 4)
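
For the simpler case in the question, where every column has the same condition (at least 3 of columns A to E greater than 1), the same counting idea can be sketched in base R with rowSums(). The data below is made up purely for illustration, and the column names A to E are assumed from the question's wording.

```r
# Hypothetical example data; column names A..E follow the question's wording.
df <- data.frame(
  A = c(2, 0, 3, 1),
  B = c(5, 1, 0, 1),
  C = c(0, 4, 2, 0),
  D = c(3, 0, 0, 1),
  E = c(1, 0, 7, 9)
)

# Comparing the selected columns with 1 gives a logical matrix;
# rowSums() then counts the TRUE values in each row.
n_met <- rowSums(df[, c("A", "B", "C", "D", "E")] > 1)

# Keep the rows where at least 3 of the 5 columns are greater than 1.
result <- df[n_met >= 3, ]
print(result)
```

This shortcut only fits when all columns share one condition. When each column has its own condition, as in the answer above, adding the individual comparisons together is the way to go.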