Multiple if_all() in dplyr::filter not working

Time:07-12

I have this dataframe:

df <- data.frame("dim" = c(1,1,1,1), "pub" = c(0,0,1,1), "sco" = c(0,0,0,0), "wos" = c(1,1,1,0))

I want to filter it by dynamically choosing which of the columns should have 1 or 0 as their value.

Example:

yes <- c("dim", "wos")  # these columns should have value 1
no <- c("pub", "sco")   # these columns should have value 0

df %>%
   filter(if_all(yes) == 1 & if_all(no) == 0)

However, the result is not correct: it includes a row where the column pub has 1 as its value. How come? How can I obtain the result I want (i.e., the following without row #3)?

# wrong result 
  dim pub sco wos
1   1   0   0   1
2   1   0   0   1
3   1   1   0   1   # <- this should not appear!

CodePudding user response:

You could also use across:

library(dplyr)
df |>
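  # all_of() makes it explicit that yes/no are external character vectors;
  # the sums assume the columns contain only 0 and 1
  filter(rowSums(across(all_of(yes))) == length(yes) & rowSums(across(all_of(no))) == 0)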

Or rowwise:

library(dplyr)

df |>
  rowwise() |>
  filter(sum(c_across(all_of(yes))) == length(yes) && sum(c_across(all_of(no))) == 0) |>
  ungroup()
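
Note that both variants above test the columns by summing them, so they rely on the columns holding only 0 and 1. The following is a minimal base R sketch of how a sum-based check can be fooled and how an explicit comparison avoids it. The data frame df2 is hypothetical and not part of the question.

```r
# Hypothetical data: 'dim' holds a 2 and 'wos' a 0 in row 1 (not strictly 0/1).
df2 <- data.frame(dim = c(2, 1), wos = c(0, 1))
yes <- c("dim", "wos")

# Sum-based check: row 1 sums to 2, which equals length(yes), so it passes.
sum_check <- rowSums(df2[yes]) == length(yes)

# Comparison-based check: counts how many columns are exactly 1 in each row.
cmp_check <- rowSums(df2[yes] == 1) == length(yes)

print(sum_check)
print(cmp_check)
```

If the columns are guaranteed to be 0/1 flags, as in the question, the two checks agree.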

CodePudding user response:

The comparison should be included inside if_all. When using external vectors as column names it is good practice to use all_of.

library(dplyr)

df %>% filter(if_all(all_of(yes),  ~. == 1) & if_all(all_of(no), ~. == 0))

#  dim pub sco wos
#1   1   0   0   1
#2   1   0   0   1
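
As for why the original attempt let row 3 through: the output is consistent with if_all(no), called without a function, first combining the selected columns with & and only then comparing that single combined value with 0. That reading of if_all() is an assumption about dplyr's behaviour; the base R sketch below only mimics it by hand.

```r
df <- data.frame(dim = c(1, 1, 1, 1), pub = c(0, 0, 1, 1),
                 sco = c(0, 0, 0, 0), wos = c(1, 1, 1, 0))

# Mimic the assumed behaviour: AND the columns first, then compare.
yes_combined <- (df$dim & df$wos) == 1   # TRUE only where both are non-zero
no_combined  <- (df$pub & df$sco) == 0   # TRUE wherever at least one is 0

# Row 3 passes because pub = 1 but sco = 0, so pub & sco is FALSE.
wrong <- df[yes_combined & no_combined, ]

# Comparing each column separately, then combining, excludes row 3.
right <- df[df$dim == 1 & df$wos == 1 & df$pub == 0 & df$sco == 0, ]

print(wrong)
print(right)
```

Here wrong keeps rows 1 to 3, matching the output in the question, while right keeps only rows 1 and 2.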

In base R, you can use rowSums:

df[rowSums(df[yes] == 1) == length(yes) & rowSums(df[no] == 0) == length(no), ]