Creating a wrapper function that uses tidyverse-like syntax

Time:09-26

I'm trying to create a wrapper function that will let me use tidyverse-like syntax (i.e. replacing dat$col with dat, col) whilst simultaneously adding some default arguments. I'm struggling with the first part - likely because I don't have a good grasp of base R (and maybe data masking?)

# What I would like to recreate
summary(mtcars$mpg)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>   10.40   15.43   19.20   20.09   22.80   33.90

# My attempt
tidy_summary <- function(data, variable) {
        variable_name <- {{  variable  }}
        summary(data$variable_name)
}

tidy_summary(mtcars, mpg)
#> Error in tidy_summary(mtcars, mpg): object 'mpg' not found

Created on 2022-09-26 by the reprex package (v2.0.1)

I tried using the [ operator instead, or passing the arguments as strings, without luck.

CodePudding user response:

You can do

tidy_summary <- function(data, variable) {
  
  summary(data[[deparse(substitute(variable))]])
}

tidy_summary(mtcars, mpg)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>   10.40   15.43   19.20   20.09   22.80   33.90

The $ operator doesn't work here because it never evaluates its right-hand side: data$variable_name always looks for a column literally called variable_name, not for the column whose name is stored in the variable variable_name. The {{ }} syntax doesn't help either. Outside a data-masking function it is just two nested pairs of braces, so R tries to evaluate variable, and that produces the "object 'mpg' not found" error. Instead, we use the [[ operator and pass it a string containing the actual column name, which we get with deparse(substitute(variable)).
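To make the difference concrete, here is a minimal base R sketch. The data frame df and the helper show_name() are made up for illustration and are not part of the question.

```r
# Hypothetical data frame with a column that really is called `variable_name`
df <- data.frame(mpg = c(21, 22.8), variable_name = c(1, 2))

# A variable holding the name of the column we actually want
variable_name <- "mpg"

df$variable_name     # `$` takes the name literally: returns the column `variable_name`
df[[variable_name]]  # `[[` evaluates its argument: returns the column `mpg`

# substitute() captures the unevaluated argument, deparse() turns it into a string
show_name <- function(x) deparse(substitute(x))
show_name(mpg)       # the string "mpg", even though no object `mpg` exists
```

Because substitute() never evaluates its argument, the missing mpg object does not matter.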

Created on 2022-09-26 with reprex v2.0.2

CodePudding user response:

Allan's answer is the way to go. As a complement: there has been discussion about creating a better(?) summary() function, one that can take columns as arguments. You can use skimr::skim() much the way you were trying to use summary(), passing the column name as a string:

tidy_summary <- function(data, variable) {
        # `variable` is a string here; all_of() marks it as an external column name
        skimr::skim(data, tidyselect::all_of(variable))
}

tidy_summary(mtcars, "mpg")

Or, if you want to avoid passing the column as a string, rlang::ensym() will do the trick:

tidy_summary <- function(data, variable) {
        # ensym() captures the bare column name as a symbol; !! injects it into skim()
        variable_name <- rlang::ensym(variable)
        skimr::skim(data, !!variable_name)
}

tidy_summary(mtcars, mpg)

For more on quasiquotation, see Chapter 19 of Advanced R.
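The question also asked about adding default arguments. One possible sketch, assuming dplyr is installed (neither answer above uses it): {{ }} does work when it is forwarded to a function that accepts bare column names, such as dplyr::pull(). The name tidy_summary_pull and the default digits = 4 are arbitrary choices for illustration.

```r
# Assumption: dplyr is installed. The function name and the default are hypothetical.
tidy_summary_pull <- function(data, variable, digits = 4, ...) {
  # pull() accepts a bare column name, so {{ }} can forward `variable` to it
  x <- dplyr::pull(data, {{ variable }})
  # The default can be overridden by the caller; extra arguments go on to summary()
  summary(x, digits = digits, ...)
}

tidy_summary_pull(mtcars, mpg)              # uses the default digits = 4
tidy_summary_pull(mtcars, mpg, digits = 2)  # overrides the default
```

Defaults set this way stay visible in the function signature, and ... leaves room for any other argument that summary() accepts.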
