Getting variables from data for functions in R

Time:03-09

I am trying to write a function in R, but I am confused about how to refer to variables from a data frame inside the function. Say I have this data frame:

tb <- tibble(x = 1:5)

and I create a function:

f <- function(data, z){
# some function
}

I then want to be able to just do:

f(data = tb, z = x)

but I get "Error in f(data = tb, z = x): object 'x' not found". I want it to behave like the lm function, where I pass the data and can then refer to its variables directly, like so:

lm(mpg ~ cyl + disp, data = mtcars)

But I'm not sure how to make this work inside my own function. How can I do this, and does this approach make sense?

Answer:

The question is tagged neither dplyr nor tidyverse, so I assume base R answers are on topic.

Base R

In base R, to pass a variable to a function unquoted (and therefore unevaluated), capture it with deparse(substitute(z)) inside the function, which gives the column name as a string. This also works with the native pipe operator |> introduced in R 4.1.

tb <- data.frame(x = 1:5)

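f <- function(data, z){
  # capture the unevaluated argument and convert it to a string, e.g. "x"
  z <- deparse(substitute(z))
  # extract that column by name and double it
  data[[z]] * 2
}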

f(tb, z = x)
#> [1]  2  4  6  8 10

tb |> f(z = x)
#> [1]  2  4  6  8 10

Created on 2022-03-09 by the reprex package (v2.0.1)
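
Note that deparse(substitute(z)) assumes z is a bare column name; a call such as f(tb, z = x + 1) would look for a column literally named "x + 1". As a minimal sketch of the lm-style behaviour the question asks about, the captured expression can instead be evaluated inside the data with eval(). The name f2 is hypothetical and only used for illustration here.

```r
tb <- data.frame(x = 1:5)

# Hypothetical variant of f(): evaluates `z` as an expression inside `data`
f2 <- function(data, z) {
  # capture the unevaluated argument as a language object
  expr <- substitute(z)
  # look names up in `data` first, then fall back to the caller's environment
  val <- eval(expr, data, parent.frame())
  val * 2
}

f2(tb, z = x)
#> [1]  2  4  6  8 10

f2(tb, z = x + 1)
#> [1]  4  6  8 10 12
```

Passing parent.frame() as the enclosure means that names not found among the columns are resolved where f2() was called, which is roughly what a user would expect from a formula-style interface.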

Package dplyr

With dplyr the function above still works, but the idiomatic way is to embrace the argument in double braces, {{ }}.

suppressPackageStartupMessages(library(dplyr))

g <- function(data, z){
  # {{z}} forwards the unquoted column to mutate(); the result overwrites column x
  data %>% mutate(x = {{z}} * 2)
}

tb %>% f(x)
#> [1]  2  4  6  8 10

tb %>% g(x)
#>    x
#> 1  2
#> 2  4
#> 3  6
#> 4  8
#> 5 10

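
Two related variations may be useful, shown here as a sketch that assumes a reasonably recent dplyr (1.0 or later). The helper names g2 and g3 are hypothetical. g2 uses the "{{ z }}_doubled" := naming syntax so the original column is kept rather than overwritten, and g3 uses the .data pronoun for the case where the column name arrives as a string.

```r
suppressPackageStartupMessages(library(dplyr))

tb <- data.frame(x = 1:5)

# Hypothetical helper: stores the result in a new column named "<z>_doubled"
g2 <- function(data, z) {
  data %>% mutate("{{ z }}_doubled" := {{ z }} * 2)
}

# Hypothetical helper: `z` is a character string naming the column
g3 <- function(data, z) {
  data %>% mutate(doubled = .data[[z]] * 2)
}

res2 <- tb %>% g2(x)    # columns: x, x_doubled
res3 <- tb %>% g3("x")  # columns: x, doubled
```

Which form fits depends on how the function will be called: {{ }} suits interactive use with bare column names, while .data[[z]] suits code that receives names programmatically.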
