I am trying to write a function in R, and I am a bit confused about how to refer to variables from the data inside the function. Say I have this data frame:
tb <- tibble(x = 1:5)
and I create a function:
f <- function(data, z){
# some function
}
I then want to be able to just do:
f(data = tb, z = x)
but I get "Error in f(data = tb, z = x): object x not found". I want to be able to do something like the lm function, where I pass the data and can then reference its variables by name, like so:
lm(mpg ~ cyl + disp, data = mtcars)
But I'm not sure how to make this work inside my own function. Any help on how I can do this, or on whether this even makes sense?
CodePudding user response:
The question is not tagged dplyr or tidyverse, so I assume base R answers are on topic.
Base R
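In base R, to pass a variable unquoted (and therefore unevaluated) to a function, use deparse(substitute(z)) inside the function, where z is the argument name: substitute() captures the expression the caller typed and deparse() turns it into a string. This also works with the native pipe operator |> introduced in R 4.1.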
tb <- data.frame(x = 1:5)
f <- function(data, z){
z <- deparse(substitute(z))
data[[z]] * 2
}
f(tb, z = x)
#> [1]  2  4  6  8 10
tb |> f(z = x)
#> [1]  2  4  6  8 10
Created on 2022-03-09 by the reprex package (v2.0.1)
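If the argument should accept a whole expression rather than a single column name, which is closer to how lm looks up variables in data, one possible base R sketch is to evaluate the captured expression with the data frame's columns in scope. The name f_expr is made up for this illustration, and this is only one of several ways to do it.

```r
tb <- data.frame(x = 1:5)

# f_expr is a hypothetical name used only for this illustration
f_expr <- function(data, z) {
  # capture the unevaluated expression supplied for z
  expr <- substitute(z)
  # evaluate it with the columns of `data` in scope, falling back to
  # the caller's environment for anything that is not a column
  eval(expr, data, parent.frame())
}

f_expr(tb, x * 2)
#> [1]  2  4  6  8 10
f_expr(tb, x + 10)
#> [1] 11 12 13 14 15
```

Unlike the deparse(substitute(z)) version, this accepts any expression built from the columns, at the cost of being harder to reason about when a column and a variable in the calling environment share a name.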
Package dplyr
With package dplyr, the function above still works, but the idiomatic way is to wrap the argument in double braces, {{ }}.
suppressPackageStartupMessages(library(dplyr))
g <- function(data, z){
data %>% mutate(x = {{z}} * 2)
}
tb %>% f(x)
#> [1]  2  4  6  8 10
tb %>% g(x)
#>    x
#> 1  2
#> 2  4
#> 3  6
#> 4  8
#> 5 10
Created on 2022-03-09 by the reprex package (v2.0.1)
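Note that g overwrites the column x with the doubled values. If the original column should be kept, a small variation is to build the new column name from the argument as well, using a glue-style string on the left-hand side of :=. The helper name h and the "_doubled" suffix are assumptions chosen for this sketch.

```r
suppressPackageStartupMessages(library(dplyr))

tb <- data.frame(x = 1:5)

# h is a hypothetical helper: like g, but it adds a new column
# named after the argument instead of overwriting x
h <- function(data, z) {
  data %>% mutate("{{ z }}_doubled" := {{ z }} * 2)
}

out <- tb %>% h(x)
names(out)   # the original column plus the newly named one
out$x_doubled
```

Here the same embraced argument supplies both the values and the name of the result, so calling h(x) on a different column needs no other changes.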