My understanding is that R's lexical scoping resolves free variables in a function by searching the environment in which the function was defined and then its parent environments. However, I am having trouble reconciling this with a function call that does not produce the error I expect.
Suppose I define a function foo in the global environment and pass it arguments that are either objects (e.g., a data.frame) in the global environment or the unquoted names of elements of that object.
library(dplyr)

# Example input objects
dv <- "c"
df <- data.frame(x = rep(c(3, NA_real_), 5),
                 y = letters[1:10],
                 z = 1:10)

# Define a function
foo <- function(df, dv, response, treat) {
  df %>%
    filter(y %in% dv) %>%
    filter(!is.na(response)) %>%
    select(treat)
}
My understanding is that y is a free variable here, so I would expect R to look for y in the global environment where foo was defined, find nothing, and throw an error. However, the errors/warnings I get are unrelated to y:
foo(df = df, dv = dv, response = x, treat = z)
#> Error in `filter()`:
#> ! Problem while computing `..1 = !is.na(response)`.
#> Caused by error in `mask$eval_all_filter()`:
#> ! object 'x' not found
While we can fix those scoping errors by quoting and unquoting (see below), it remains unclear to me how y is recognized as an unquoted column name without producing an error.
foo_new <- function(df, dv, response, treat) {
  response <- enquo(response)
  treat <- enquo(treat)
  df %>%
    filter(y %in% dv) %>%
    filter(!is.na(!!response)) %>%
    select(!!treat)
}

foo_new(df, dv, x, z)
#>   z
#> 1 3
Answer:
It might help to make things more explicit with regard to quoted vs. unquoted expressions and the environments that objects come from. If I were to roll up foo into an R package, this is what I'd do (using roxygen2 comments to make the types of the function arguments explicit).
#' Test function
#'
#' @param df A `data.frame`.
#' @param dv A `character` scalar.
#' @param response An unquoted expression corresponding to a column in `df`.
#' @param treat An unquoted expression corresponding to a column in `df`.
#'
#' @importFrom magrittr "%>%"
#' @importFrom dplyr filter select
#' @importFrom rlang .data
foo_explicit <- function(df, dv, response, treat) {
  df %>%
    filter(.data$y %in% dv) %>%
    filter(!is.na({{ response }})) %>%
    select({{ treat }})
}
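As a quick check, the function can be called with the example objects from the question. This is only a minimal sketch: it assumes dplyr is installed and attached, and it repeats the function body so the snippet runs on its own outside a package.

```r
library(dplyr)

# Example inputs, copied from the question
dv <- "c"
df <- data.frame(x = rep(c(3, NA_real_), 5),
                 y = letters[1:10],
                 z = 1:10)

# Same body as foo_explicit above, repeated so the snippet is self-contained
foo_explicit <- function(df, dv, response, treat) {
  df %>%
    filter(.data$y %in% dv) %>%
    filter(!is.na({{ response }})) %>%
    select({{ treat }})
}

# Column names are passed unquoted, just as with foo_new
res <- foo_explicit(df, dv, response = x, treat = z)
res
#>   z
#> 1 3
```

The result matches the output of foo_new in the question, which is what you would expect if the two spellings are equivalent.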
A few comments:
- .data$y inside filter makes it explicit that y is a column within df.
- The dv argument is a character scalar within the foo_explicit environment.
- The response and treat arguments are unquoted expressions. The curly-curly operator is just a shortcut for the enquo / !! construct that you use in foo_new.
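To tie this back to the original question: y never triggers an "object not found" error because filter() uses data masking. Its arguments are evaluated with the columns of the data frame looked up first, and only then does R fall back to the usual lexical scoping. The sketch below, which assumes dplyr is attached, uses the .data and .env pronouns to show which lookup happens, and also checks that curly-curly and enquo / !! give the same result. The helper names pick_embrace and pick_enquo and the small data frame are hypothetical and only there for illustration.

```r
library(dplyr)

df <- data.frame(y = letters[1:5], z = 1:5)

# A global variable that shares its name with a column of df
y <- "not a column"

# Inside filter(), a bare `y` resolves to the column df$y first (data masking),
# so the global `y` is never consulted.
masked <- df %>% filter(y %in% c("a", "b"))

# The pronouns make each lookup explicit:
#   .data$y -> the column y of df
#   .env$y  -> the variable y in the calling environment
explicit <- df %>% filter(.data$y %in% c("a", "b"))
from_env <- df %>% filter(.env$y == "not a column")

nrow(masked)    # rows where the column y is "a" or "b"
nrow(explicit)  # same rows as `masked`
nrow(from_env)  # the condition is TRUE for every row, so all rows are kept

# `{{ col }}` is shorthand for enquo(col) followed by `!!col`
pick_embrace <- function(data, col) {
  data %>% select({{ col }})
}
pick_enquo <- function(data, col) {
  col <- enquo(col)
  data %>% select(!!col)
}
same_result <- identical(pick_embrace(df, z), pick_enquo(df, z))
same_result
```

Using .data$y inside a function is mostly about intent: it guards against a variable named y in the calling environment being picked up by accident when the column is missing.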