Using tidy eval for multiple, arbitrary filter conditions

Time:10-16

I would like to use tidy evaluation to write multiple, fully flexible filter conditions. A related but simpler problem has been solved in this Stack Overflow question. The following code, adapted from that question, works. It applies two filter conditions to the gapminder data set and returns the filtered data.

library(tidyverse)
library(gapminder)

my_filter <- function(df, cols, vals){    
  paste_filter <- function(x, y) quo(!!sym(x) %in% {{y}})
  fp <- pmap(list(cols, vals), paste_filter)
  filter(df, !!!fp)
}

cols <- list("country", "year")
vals = list(c("Albania", "France"), c(2002, 2007))
gapminder %>% my_filter(cols, vals) 

The problem: So far this solution is restricted to one type of filter operator (%in%). I would like to extend this approach to accept arbitrary types of operators (==, %in%, >, ...). The intended function my_filter is supposed to handle the following:

cols <- list("country", "year")
ops <- list("%in%", ">=")
vals = list(c("Albania", "France"), 2007)
gapminder %>% my_filter(cols, ops, vals)

The use case I have in mind is Shiny apps. With this functionality, we could more easily let users set arbitrary filter conditions on the variables of a data set.

Answer:

Create a list of calls and splice them in:

library(dplyr)
library(gapminder)

cols <- list("country", "year")
ops <- list("%in%", ">=")
vals <- list(c("Albania", "France"), 2007)

# Assumes LHS is the name of a variable and OP is
# the name of a function
op_call <- function(op, lhs, rhs) {
  call(op, sym(lhs), rhs)
}

my_filter <- function(data, cols, ops, vals) {
  exprs <- purrr::pmap(list(ops, cols, vals), op_call)
  data %>% dplyr::filter(!!!exprs)
}

gapminder %>% my_filter(cols, ops, vals)
#> # A tibble: 2 × 6
#>   country continent  year lifeExp      pop gdpPercap
#>   <fct>   <fct>     <int>   <dbl>    <int>     <dbl>
#> 1 Albania Europe     2007    76.4  3600523     5937.
#> 2 France  Europe     2007    80.7 61083916    30470.
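
Since the question mentions Shiny, where the operator names would come from user input, it may be worth restricting which functions can be called. The following is only a sketch under some assumptions: `allowed_ops` and `checked_op_call()` are hypothetical names that are not part of the answer above, `as.name()` stands in for `sym()` so the example needs no packages, and a small toy data frame replaces gapminder. It also shows that each element built by `op_call()` is an ordinary unevaluated call that can be inspected before it is spliced into `filter()`.

```r
# Base R only; as.name() stands in for sym() used in the answer above
op_call <- function(op, lhs, rhs) {
  call(op, as.name(lhs), rhs)
}

# Hypothetical allow-list of operators (an assumption, adjust as needed)
allowed_ops <- c("==", "!=", "<", "<=", ">", ">=", "%in%")

# Hypothetical wrapper: refuse anything that is not on the allow-list
checked_op_call <- function(op, lhs, rhs) {
  if (!is.character(op) || length(op) != 1 || !(op %in% allowed_ops)) {
    stop("Unsupported operator: ", paste(op, collapse = ", "))
  }
  op_call(op, lhs, rhs)
}

# Toy data standing in for gapminder
df <- data.frame(
  country = c("Albania", "France", "Albania", "Chile"),
  year = c(2002, 2007, 2007, 2007)
)

e1 <- checked_op_call("%in%", "country", c("Albania", "France"))
e2 <- checked_op_call(">=", "year", 2007)

# Each element is a plain call object; deparse() shows the code it represents
deparse(e1)
deparse(e2)

# Evaluate both calls with the data frame as the environment and
# keep the rows where every condition holds
keep <- eval(e1, df) & eval(e2, df)
result <- df[keep, ]
result
```

In `my_filter()` itself, the same check could be applied by mapping `checked_op_call` instead of `op_call` in the `pmap()` step, so that an operator name such as `"system"` stops with an error instead of being turned into a call.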