I am passing a tibble to a user-defined function whose arguments name the columns to use. After studying this, this, and this, I came up with the working function below. My goal is to include an equivalent function in an R package. The function works, but is there a more idiomatic approach within the dplyr/tidyeval/tidyverse world?
library(tidyverse)
dat0 <- tibble( a = seq(as.Date('2022-02-10'), as.Date('2022-03-01'), by = "5 days")
, b = seq(10,40,10))
myCalc <- function(data, dateIn, numIn, yearOut, numOut) {
data <- data %>%
mutate(.
, {{yearOut}} := lubridate::year(.data[[dateIn]])
, {{numOut}} := 10 * .data[[numIn]]
) %>%
filter(.
, .data[[numOut]] > 250
)
}
dat2 <- myCalc(dat0
, dateIn = "a"
, numIn = "b"
, yearOut = "c"
, numOut = "d")
dat2
# A tibble: 2 × 4
a b c d
<date> <dbl> <dbl> <dbl>
1 2022-02-20 30 2022 300
2 2022-02-25 40 2022 400
CodePudding user response:
Since you are already using the curly-curly {{ operator, you can use it throughout the function so that the arguments are passed as bare (unquoted) column names:
myCalc <- function(data, dateIn, numIn, yearOut, numOut) {
data <- data %>%
mutate(.
, {{yearOut}} := lubridate::year({{ dateIn }})
, {{numOut}} := 10 * {{ numIn }}
) %>%
filter(.
, {{ numOut }} > 250
)
return(data)
}
Your use of strings does work (e.g. .data[[dateIn]] evaluates to .data[["a"]] in your example). As mentioned in the comments by @r2evans, the difference is really in how the function is called.
This function would be called like so (note the lack of quotes in the arguments):
dat2 <- myCalc(dat0,
dateIn = a,
numIn = b,
yearOut = c,
numOut = d)
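To make the difference between the two calling styles concrete, here is a minimal sketch that defines both versions side by side and runs them on the data from the question. The names calc_str and calc_bare are hypothetical, and the "{name}" / "{{ name }}" glue syntax on the left of := assumes a reasonably recent dplyr (1.0 or later).

```r
library(dplyr)

dat0 <- tibble(
  a = seq(as.Date("2022-02-10"), as.Date("2022-03-01"), by = "5 days"),
  b = seq(10, 40, 10)
)

# String interface (hypothetical name): column names arrive as character strings.
# "{name}" on the left of := names the new column,
# .data[[name]] looks up an existing column by its string name.
calc_str <- function(data, dateIn, numIn, yearOut, numOut) {
  data %>%
    mutate(
      "{yearOut}" := lubridate::year(.data[[dateIn]]),
      "{numOut}" := 10 * .data[[numIn]]
    ) %>%
    filter(.data[[numOut]] > 250)
}

# Bare-name interface (hypothetical name): arguments are embraced with {{ }},
# so callers pass column names without quotes.
calc_bare <- function(data, dateIn, numIn, yearOut, numOut) {
  data %>%
    mutate(
      "{{ yearOut }}" := lubridate::year({{ dateIn }}),
      "{{ numOut }}" := 10 * {{ numIn }}
    ) %>%
    filter({{ numOut }} > 250)
}

res_str  <- calc_str(dat0, dateIn = "a", numIn = "b", yearOut = "c", numOut = "d")
res_bare <- calc_bare(dat0, dateIn = a, numIn = b, yearOut = c, numOut = d)

# Compare the two results
all.equal(res_str, res_bare)
```

Which interface to prefer is mostly a question of who calls the function: bare names read naturally in interactive use, while strings are easier when the column names come from a config file or another function.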
You can read more about this with ?rlang::`nse-defuse` and ?rlang::`nse-force`. There is also this tidyverse article with more on the subject.
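Since the question mentions putting this in a package, here is a rough sketch of how the bare-name version might look there. It is only an illustration: my_calc is a hypothetical name, the roxygen2 tags assume the package is documented with roxygen2, and the @importFrom line reflects the common practice of importing := from rlang so that R CMD check does not flag it as an undefined function. If you keep the string interface instead, .data would typically be imported from rlang the same way.

```r
#' Add a year column and a scaled column, then filter (hypothetical example)
#'
#' @param data A data frame or tibble.
#' @param dateIn,numIn Existing columns, given as bare names.
#' @param yearOut,numOut Names for the new columns, given as bare names.
#' @return `data` with the two new columns, filtered to rows where
#'   the `numOut` column exceeds 250.
#' @importFrom rlang :=
#' @export
my_calc <- function(data, dateIn, numIn, yearOut, numOut) {
  # Namespace-qualified calls instead of library(), as is usual in package code
  out <- dplyr::mutate(
    data,
    "{{ yearOut }}" := lubridate::year({{ dateIn }}),
    "{{ numOut }}" := 10 * {{ numIn }}
  )
  dplyr::filter(out, {{ numOut }} > 250)
}

# Usage outside a package: the roxygen comments are plain comments here
example_df <- data.frame(
  a = as.Date(c("2022-02-15", "2022-02-25")),
  b = c(20, 40)
)
example_out <- my_calc(example_df, dateIn = a, numIn = b, yearOut = c, numOut = d)
```

dplyr and lubridate would also need to be listed under Imports in the package's DESCRIPTION file.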