dplyr filter - how to filter on the complement of multiple logical conditions

Time:12-29

Maybe I don't understand how filter works in dplyr, but I thought you could put ! in front of a condition to filter on its complement. The first filter below correctly returns n = 203, and I want to exclude those rows from the dataset. Is it not possible to do this with ! and filter, as I attempted in the second filter? It returns 1056 observations instead of the expected 1607 - 203 = 1404.

Otherwise, how do I easily do this?

> dat |> 
    summarise(n = n())
# A tibble: 1 × 1
      n
  <int>
1  1607
> 
> dat |> 
    filter(max_num > 1 & base_end_diff < 0 | max_num > 1 & mostrecent_start_diff < 0) |> 
    summarise(n = n())
# A tibble: 1 × 1
      n
  <int>
1   203
> 
> dat |> 
    filter(!(max_num > 1 & base_end_diff < 0 | max_num > 1 & mostrecent_start_diff < 0)) |> 
    summarise(n = n())
# A tibble: 1 × 1
      n
  <int>
1  1056

CodePudding user response:

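You must have NA values in your dataset. filter() keeps only the rows where the condition is TRUE, so rows where it evaluates to NA are dropped. Because !NA is still NA, those rows are dropped by the negated filter as well, which accounts for the 1607 - 203 - 1056 = 348 rows that appear in neither result.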

Here is an example:

> mtcars1 <- mtcars
> mtcars1[10:20, ] <- NA
> 
> mtcars1 |>
    summarise(n = n())
   n
1 32
> 
> mtcars1 |>
    filter(mpg > 15 & gear < 4 | mpg > 15 & gear < 4) |>
    summarise(n = n())
  n
1 7
> 
> mtcars1 |>
    filter(!(mpg > 15 & gear < 4 | mpg > 15 & gear < 4)) |>
    summarise(n = n())
   n
1 14
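
If the goal is to keep every row that is not definitely matched by the condition (which is what the expected 1607 - 203 count implies), the NA results have to be handled explicitly. The following is a minimal sketch under assumptions: the toy `dat` and the helper column `drop_me` are made up for illustration and are not the asker's data, and whether NA rows really should be kept is a judgment call about the analysis.

```r
library(dplyr)

# Hypothetical toy data: column names mirror the question, values are invented.
dat <- tibble(
  max_num               = c(2, 2, 1, 2, NA, 2),
  base_end_diff         = c(-1, 3, -1, NA, -1, 2),
  mostrecent_start_diff = c(5, -2, 4, 1, 2, NA)
)

# Compute the condition once so the rule and its complement cannot drift apart.
# drop_me is TRUE, FALSE, or NA for each row.
flagged <- dat |>
  mutate(drop_me = max_num > 1 & base_end_diff < 0 |
                   max_num > 1 & mostrecent_start_diff < 0)

# Rows where the condition is definitely TRUE.
n_true <- flagged |> filter(drop_me) |> nrow()

# Naive complement: silently loses the rows where drop_me is NA.
n_naive <- flagged |> filter(!drop_me) |> nrow()

# Complement that treats NA as "not matched", so FALSE and NA rows are kept.
kept <- flagged |> filter(!coalesce(drop_me, FALSE))

# The same complement written with is.na().
kept2 <- flagged |> filter(!drop_me | is.na(drop_me))
```

With this approach `nrow(kept)` equals `nrow(dat) - n_true`, whereas `n_naive` is smaller whenever the condition is NA for some rows. Counting `sum(is.na(flagged$drop_me))` is a quick way to check how many rows are affected before deciding how to treat them.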