I have this dataframe:
df <- data.frame("dim" = c(1,1,1,1), "pub" = c(0,0,1,1), "sco" = c(0,0,0,0), "wos" = c(1,1,1,0))
I want to filter it by dynamically choosing which columns should have 1 or 0 as their value.
Example:
yes <- c("dim", "wos") # these columns should have value 1
no <- c("pub", "sco") # these columns should have value 0
df %>%
  filter(if_all(yes) == 1 & if_all(no) == 0)
However, the result is not correct: it includes a row where the column pub has 1 as its value. How come? How can I obtain the result I want (i.e., the following without row 3)?
# wrong result
  dim pub sco wos
1   1   0   0   1
2   1   0   0   1
3   1   1   0   1 # <- this should not appear!
CodePudding user response:
You could also use across:
library(dplyr)
# assumes the columns only hold 0 or 1, so the row sums count the 1s
df |>
  filter(rowSums(across(all_of(yes))) == length(yes) & rowSums(across(all_of(no))) == 0)
Or rowwise:
library(dplyr)
df |>
  rowwise() |>
  filter(sum(c_across(all_of(yes))) == length(yes) && sum(c_across(all_of(no))) == 0) |>
  ungroup()
CodePudding user response:
The comparison should be included inside if_all. When using external vectors as column names, it is good practice to use all_of.
library(dplyr)
df %>% filter(if_all(all_of(yes), ~. == 1) & if_all(all_of(no), ~. == 0))
#  dim pub sco wos
#1   1   0   0   1
#2   1   0   0   1
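As for why the original call keeps row 3, here is a rough base R sketch of what I believe happens. It assumes that if_all() called without a predicate combines the selected columns themselves with &, so the numeric values are coerced to logicals before the == 0 comparison is ever applied. The names all_no and per_col are only for illustration.

```r
df <- data.frame(dim = c(1, 1, 1, 1), pub = c(0, 0, 1, 1),
                 sco = c(0, 0, 0, 0), wos = c(1, 1, 1, 0))
no <- c("pub", "sco")

# Comparison applied AFTER combining the columns (what the question did, under
# the assumption stated above): pub & sco is FALSE in every row because sco is
# always 0, and FALSE == 0 is TRUE, so no row is rejected.
all_no <- Reduce(`&`, df[no])
print(all_no == 0)

# Comparison applied to EACH column first, then combined (what the answer does):
# rows where pub is 1 are now rejected.
per_col <- Reduce(`&`, lapply(df[no], function(x) x == 0))
print(per_col)
```

The first print shows TRUE for all four rows, the second shows TRUE only for rows 1 and 2, which matches the wrong and the corrected output respectively.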
In base R, you can use rowSums:
df[rowSums(df[yes] == 1) == length(yes) & rowSums(df[no] == 0) == length(no), ]
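If you need this in several places, the base R line can be wrapped in a small function. This is only a sketch: filter_flags is a made-up name, not something from dplyr or base R, and it assumes yes and no are non-empty character vectors of existing column names with no NA values in those columns.

```r
# Hypothetical helper: keep rows where every column in `yes` equals 1
# and every column in `no` equals 0.
filter_flags <- function(data, yes, no) {
  yes_ok <- rowSums(data[yes] == 1) == length(yes)
  no_ok  <- rowSums(data[no] == 0) == length(no)
  # drop = FALSE keeps a data frame even if only one column is present
  data[yes_ok & no_ok, , drop = FALSE]
}

df <- data.frame(dim = c(1, 1, 1, 1), pub = c(0, 0, 1, 1),
                 sco = c(0, 0, 0, 0), wos = c(1, 1, 1, 0))
result <- filter_flags(df, yes = c("dim", "wos"), no = c("pub", "sco"))
print(result)
```

With the example data this should leave only the first two rows, the same as the if_all version above.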