I'm writing a function where the user specifies the column they want to filter and what cutoff value they want. In this example, I want to filter out any pretest scores under 2. Here's a sample dataset:
library(dplyr)
test <- tibble(name = c("Corey", "Justin", "Sibley", "Kate"),
               pretest_score = c(1:4),
               posttest_score = c(5:8),
               final_score = c(9:12))
filter_function <- function(data, test_type = c(pretest, posttest, final), value) {
  test_character <- deparse(substitute(test_type))
  test_score <- paste0(test_character, "_score")
  data %>%
    filter({{test_score}} > value)
}
filter_function(test, test_type = pretest, value = 2)
I've also tried !!test_score, test_score (with nothing around it), and ensym(test_score) from rlang, all to no avail.
Note: I know that in this example, I could just specify pretest_score, posttest_score, etc as the test type, but in my real dataset, I have many dimensions for these tests that users can determine cutoffs for (pretest_score, pretest_date, pretest_location, etc.), so it's important that I merge the column prefix with the suffix (here, _score) within the function itself.
Thank you for any help!
CodePudding user response:
Convert the character string to a symbol and evaluate it with !!:
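filter_function <- function(data, test_type = c(pretest, posttest, final), value) {
  # capture the unquoted argument as a string, e.g. "pretest"
  test_character <- deparse(substitute(test_type))
  # build the full column name, e.g. "pretest_score"
  test_score <- paste0(test_character, "_score")
  data %>%
    # turn the string into a symbol and unquote it so filter() sees the column
    filter(!!rlang::sym(test_score) > value)
}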
Testing:
> filter_function(test, test_type = pretest, value = 2)
# A tibble: 2 × 4
  name   pretest_score posttest_score final_score
  <chr>          <int>          <int>       <int>
1 Sibley             3              7          11
2 Kate               4              8          12
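As an alternative to building a symbol, dplyr's .data pronoun accepts a column name held in a string. The sketch below assumes the same test tibble as in the question; filter_function_data is a hypothetical name used only for this illustration.

```r
library(dplyr)

# Sample data from the question
test <- tibble(name = c("Corey", "Justin", "Sibley", "Kate"),
               pretest_score = 1:4,
               posttest_score = 5:8,
               final_score = 9:12)

# Hypothetical variant: same logic as above, but the column is looked up
# by name through the .data pronoun instead of !!rlang::sym()
filter_function_data <- function(data, test_type, value) {
  test_character <- deparse(substitute(test_type))   # e.g. "pretest"
  test_score <- paste0(test_character, "_score")     # e.g. "pretest_score"
  data %>%
    filter(.data[[test_score]] > value)
}

result <- filter_function_data(test, test_type = pretest, value = 2)
result
```

Both forms refer to the column inside the data frame, so the choice between them is mostly a matter of taste.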
CodePudding user response:
A few points:
Normally, when a function offers a fixed set of options in R, the argument is a character vector, not an unevaluated expression. R provides match.arg specifically for this purpose. It also makes the first option the default, so if we use match.arg, the call invoking the function could have omitted test_type = "pretest".
.[[test_score]] can be used to refer to the indicated column.
Thus we have
filter_function <- function(data, test_type = c("pretest", "posttest", "final"),
                            value) {
  test_type <- match.arg(test_type)
  test_score <- paste0(test_type, "_score")
  data %>%
    filter(.[[test_score]] > value)
}
filter_function(test, test_type = "pretest", value = 2)
# A tibble: 2 x 4
  name   pretest_score posttest_score final_score
  <chr>          <int>          <int>       <int>
1 Sibley             3              7          11
2 Kate               4              8          12
# pretest is the default
filter_function(test, value = 2)
# A tibble: 2 x 4
  name   pretest_score posttest_score final_score
  <chr>          <int>          <int>       <int>
1 Sibley             3              7          11
2 Kate               4              8          12
Also note that we could define the function like this. The user can still specify "pretest" since match.arg will match the leading substring. In fact, they could even specify "pre" or "post".
filter_function2 <- function(data,
                             test_type = c("pretest_score", "posttest_score", "final_score"),
                             value) {
  test_type <- match.arg(test_type)
  data %>%
    filter(.[[test_type]] > value)
}
filter_function2(test, test_type = "pretest", value = 2)
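To see the matching behaviour in isolation, here is a minimal sketch using only base R. choose_type is a hypothetical helper written just for this demonstration; it simply returns whatever match.arg resolves.

```r
# Hypothetical helper: returns the option that match.arg() resolves to
choose_type <- function(test_type = c("pretest_score", "posttest_score", "final_score")) {
  match.arg(test_type)
}

default_choice <- choose_type()        # argument omitted: first option is used
pre_choice     <- choose_type("pre")   # unique leading substring
post_choice    <- choose_type("post")  # unique leading substring

# "p" is a prefix of two options, so match.arg() raises an error;
# the error is caught here only to record that it happened
ambiguous <- tryCatch(choose_type("p"), error = function(e) "error")

# A value that matches no option also raises an error
unknown <- tryCatch(choose_type("midterm"), error = function(e) "error")
```

So abbreviations work only as long as they identify exactly one option.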
Base R
This could also be done without any packages like this:
filter_function3 <- function(data, test_type = c("pretest", "posttest", "final"),
                             value) {
  test_type <- match.arg(test_type)
  test_score <- paste0(test_type, "_score")
  data[data[[test_score]] > value, ]
}
filter_function3(test, test_type = "pretest", value = 2)
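One difference worth keeping in mind: dplyr's filter drops rows where the condition is NA, while logical indexing with [ on a data frame returns a row of NAs for them. If the real data can contain missing scores, wrapping the condition in which() is one way to get the filter-like behaviour. The sketch below assumes a small made-up data frame (scores) with one missing value, and filter_function_na is a hypothetical name.

```r
# Made-up data with a missing pretest score (for illustration only)
scores <- data.frame(name = c("Corey", "Justin", "Sibley", "Kate"),
                     pretest_score = c(1, NA, 3, 4),
                     stringsAsFactors = FALSE)

# Hypothetical variant of the base R function that ignores NA comparisons
filter_function_na <- function(data, test_type = c("pretest", "posttest", "final"),
                               value) {
  test_type <- match.arg(test_type)
  test_score <- paste0(test_type, "_score")
  # which() keeps only the positions where the comparison is TRUE,
  # so NA results are dropped instead of producing NA rows
  keep <- which(data[[test_score]] > value)
  data[keep, , drop = FALSE]
}

result_na <- filter_function_na(scores, test_type = "pretest", value = 2)
result_na
```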