I have a long function that uses a dataframe column name as an input and am trying to apply it to several different column names without a new line of code each time. I am having issues with tidyselect within the function called by map. I believe the issue is related to defusing, but I cannot figure it out. A toy example using mtcars data is below.
This works correctly with map:
library(tidyverse)
sum_dplyr <- function(df, x) {
  res <- df %>% summarise(mean = mean({{ x }}, na.rm = TRUE))
  return(res)
}
sum_dplyr(mtcars, disp)
map(names(mtcars), ~ sum_dplyr(mtcars, mtcars[[.]])) # all columns -> works fine
However, this version gives the error "Must subset columns with a valid subscript vector" when the function is called through map:
library(tidyverse)
sel_dplyr <- function(df, x) {
  res <- df %>% dplyr::select({{ x }})
  return(res)
}
sel_dplyr(mtcars, disp) # ok
map(names(mtcars), ~ sel_dplyr(mtcars, mtcars[[.]])) # all columns -> error
What am I missing here? Many thanks!
CodePudding user response:
It may be better to change the function so that it accepts both unquoted and quoted column names. With map, we are passing a character string, so instead of {{}} we can use ensym with !!:
sum_dplyr <- function(df, x) {
  # ensym() accepts either a bare column name or a string and returns a symbol
  x <- rlang::ensym(x)
  res <- df %>%
    summarise(mean = mean(!!x, na.rm = TRUE))
  return(res)
}
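To see why this handles both kinds of input, here is a minimal sketch of what ensym does with a bare name and with a string. The helper to_sym is hypothetical and exists only for this illustration.

```r
library(rlang)

# Hypothetical helper: capture the argument and return it as a symbol
to_sym <- function(x) ensym(x)

from_bare   <- to_sym(disp)    # bare column name
from_string <- to_sym("disp")  # string, converted to a symbol by ensym()

# Both calls yield the same symbol, so !! injects the same thing either way
same <- identical(from_bare, from_string)
```

Because both inputs end up as the same symbol, the function body does not need to know which form the caller used.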
Similarly for sel_dplyr:
sel_dplyr <- function(df, x) {
  x <- rlang::ensym(x)
  res <- df %>%
    dplyr::select(!!x)
  return(res)
}
Then test it with column names passed as strings through map, and with a bare column name:
library(purrr)
library(dplyr)
map(names(mtcars), ~ sel_dplyr(mtcars, !!.x))
sel_dplyr(mtcars, carb)
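If the function only ever needs to receive column names as strings, another option is to skip defusing entirely and use the .data pronoun for data-masking verbs and all_of() for tidyselect verbs. The following is a sketch under that assumption; sum_chr and sel_chr are hypothetical names, not part of the original code.

```r
library(dplyr)
library(purrr)

# Hypothetical string-only variant of sum_dplyr
sum_chr <- function(df, col) {
  # .data[[col]] looks up the column whose name is stored in the string `col`
  df %>% summarise(mean = mean(.data[[col]], na.rm = TRUE))
}

# Hypothetical string-only variant of sel_dplyr
sel_chr <- function(df, col) {
  # all_of() selects the columns named in a character vector
  df %>% select(all_of(col))
}

means <- map(names(mtcars), ~ sum_chr(mtcars, .x))  # one summary per column
cols  <- map(names(mtcars), ~ sel_chr(mtcars, .x))  # one single-column data frame per column
```

The trade-off is that these variants no longer accept bare names such as disp, so the ensym approach above remains the better fit when both calling styles are needed.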