Assume we have this dataframe:
> data <- data.frame('a' = c('a', 'b', 'c'),
'b' = c('d', 'e', 'f'),
'c' = c('g', 'h', 'i'),
'd' = c('j', 'k', 'l'))
If I want to filter based on column a to have some value, I can do:
> library(dplyr)
> data %>% filter(a == "b")
a b c d
1 b e h k
If I want to filter based on columns a and b to have some value, I can do:
> library(dplyr)
> data %>% filter(a == "c" & b == "f")
a b c d
1 c f i l
What if I have a list of columns with arbitrary length? Is there a way to do something like this?
> data %>% filter(c(a,b) == c("c","f"))
So I can pass the arbitrary list of column names, alongside with the list of the wanted values, to the filter function.
CodePudding user response:
You could use a data.frame
to keep track of the conditions and use the join syntax instead of filter():
filter_df <- data.frame(a = "c", b = "f")
data |>
inner_join(filter_df)
# a b c d
# 1 c f i l
CodePudding user response:
Maybe this is an overkill for your case, but I took the freedom to allow multiple values to filter. Here is updated data -
library(dplyr)
library(purrr)
data <- data.frame('a' = c('a', 'b', 'c', 'd'),
'b' = c('d', 'e', 'f', 'k'),
'c' = c('g', 'h', 'i', 'e'),
'd' = c('j', 'k', 'l', 'g'))
data
# a b c d
#1 a d g j
#2 b e h k
#3 c f i l
#4 d k e g
Now the filter
condition is the following when done manually.
data %>% filter(a %in% c('c', 'd'), b == 'f')
# a b c d
#1 c f i l
Answer -
To do this in an automated fashion save the condition in a list, use a custom function filter_data
and use it with imap
.
filter_args <- list(a = c('c', 'd'), b = c('f'))
filter_data <- function(dat, val, var) {
dat %>% filter(.data[[var]] %in% val)
}
imap(filter_args, ~filter_data(data, .x, .y)) %>%
reduce(inner_join)
# a b c d
#1 c f i l
CodePudding user response:
I like the inner_join
solution, and this is what I would usually use where possible.
However, there are cases with more complex comparisons in which this doesn’t work. In those cases, you can pass a list of filter expressions to filter
using expression splicing.
For this, you first have to create your list of expressions by interpolating the column names and values into a comparison expression:
cols = c('a', 'b')
values = c('c', 'f')
filters = mapply(
\(c, v) bquote(.(as.name(c)) == .(v)),
cols, values,
USE.NAMES = FALSE
)
And then you can use that list of expressions inside filter
:
data |> filter(!!! filters)
CodePudding user response:
A base R answer:
cond <- Map(`==`, data[c("a", "b")], c("c", "f"))
data[Reduce(`&`, cond), ]
# a b c d
#3 c f i l
That would also work with filter
:
library(dplyr)
data %>%
filter(Reduce(`&`, Map(`==`, .[c("a", "b")], c("c", "f"))))
And translated into tidyverse
functions:
library(purrr)
library(dplyr)
data %>%
filter(map2(.[c("a", "b")], c("c", "f"), `==`) %>%
reduce(`&`))