Suppose we have the following data:
tib <- tibble::tibble(x = 1:10)
Then, suppose we want to make a function that takes a column as input and returns a tibble
with several added columns such as:
library(dplyr)
generate_transformations <- function(data, column){
  transform <- sym(column)
  data %>%
    mutate(
      sqrt = sqrt(!!transform),
      recip = 1 / !!transform,
      log = log(!!transform)
    )
}
# Usage is great:
tib %>%
generate_transformations('x')
# A tibble: 10 x 4
x sqrt recip log
<int> <dbl> <dbl> <dbl>
1 1 1 1 0
2 2 1.41 0.5 0.693
3 3 1.73 0.333 1.10
4 4 2 0.25 1.39
5 5 2.24 0.2 1.61
6 6 2.45 0.167 1.79
7 7 2.65 0.143 1.95
8 8 2.83 0.125 2.08
9 9 3 0.111 2.20
10 10 3.16 0.1 2.30
Now my question is: is there a way to avoid unquoting (!!) transform repeatedly?
Yes, I could, e.g., temporarily rename column and then rename it back after I am done, but that is not my interest in this question.
I am interested if there is a way to produce a variable that does not need the !!.
While it does not work, I was looking for something like:
generate_transformations <- function(data, column){
  transform <- !!sym(column) # cannot unquote here :(
  data %>%
    mutate(
      sqrt = sqrt(transform),
      recip = 1 / transform,
      log = log(transform)
    )
}
CodePudding user response:
Convert the column name to a string, extract that column from data with [[ once, and then use the resulting vector transform directly inside mutate:
generate_transformations <- function(data, column){
  transform <- data[[rlang::as_string(ensym(column))]]
  data %>%
    mutate(
      sqrt = sqrt(transform),
      recip = 1 / transform,
      log = log(transform)
    )
}
Testing:
tib %>%
generate_transformations('x')
# A tibble: 10 × 4
x sqrt recip log
<int> <dbl> <dbl> <dbl>
1 1 1 1 0
2 2 1.41 0.5 0.693
3 3 1.73 0.333 1.10
4 4 2 0.25 1.39
5 5 2.24 0.2 1.61
6 6 2.45 0.167 1.79
7 7 2.65 0.143 1.95
8 8 2.83 0.125 2.08
9 9 3 0.111 2.20
10 10 3.16 0.1 2.30
Or create a temporary column and remove it later:
generate_transformations <- function(data, column){
  data %>%
    mutate(transform = !! rlang::ensym(column),
           sqrt = sqrt(transform),
           recip = 1 / transform,
           log = log(transform),
           transform = NULL
    )
}
Testing:
tib %>%
generate_transformations('x')
# A tibble: 10 × 4
x sqrt recip log
<int> <dbl> <dbl> <dbl>
1 1 1 1 0
2 2 1.41 0.5 0.693
3 3 1.73 0.333 1.10
4 4 2 0.25 1.39
5 5 2.24 0.2 1.61
6 6 2.45 0.167 1.79
7 7 2.65 0.143 1.95
8 8 2.83 0.125 2.08
9 9 3 0.111 2.20
10 10 3.16 0.1 2.30
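One caveat about the first approach, as a minimal sketch: data[[...]] pulls out the whole column before mutate runs, so it does not respect grouping. The grouped tibble and the column names mean_extracted and mean_masked below are hypothetical and only serve to show the difference from the .data pronoun, which is evaluated per group. For ungrouped data and element-wise functions such as sqrt or log, as in the question, both give the same result.

```r
library(dplyr)

# Hypothetical grouped data, not part of the original question
grouped <- tibble(g = c("a", "a", "b", "b"), x = c(1, 2, 3, 4)) %>%
  group_by(g)

column <- "x"
full_column <- grouped[[column]]  # extracted once, outside mutate()

res <- grouped %>%
  mutate(
    mean_extracted = mean(full_column),      # mean over all four rows
    mean_masked    = mean(.data[[column]])   # mean within each group
  ) %>%
  ungroup()
```

Here mean_extracted is the overall mean in every row, while mean_masked differs between groups a and b. So the extraction trick is probably best kept to ungrouped data.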
CodePudding user response:
You can do it in one step if you swap !! for {{ }} and use across():
data_transformations <- function(d, col, funs=list(sqrt=sqrt, log=log, recip=~1/.)) {
  d %>% mutate(across({{col}}, .fns=funs))
}
tib %>% data_transformations(x)
# A tibble: 10 × 4
x x_sqrt x_log x_recip
<int> <dbl> <dbl> <dbl>
1 1 1 0 1
2 2 1.41 0.693 0.5
3 3 1.73 1.10 0.333
4 4 2 1.39 0.25
5 5 2.24 1.61 0.2
6 6 2.45 1.79 0.167
7 7 2.65 1.95 0.143
8 8 2.83 2.08 0.125
9 9 3 2.20 0.111
10 10 3.16 2.30 0.1
To restore your original column names, set .names so that each new column is named after its function only:
data_transformations <- function(d, col, funs=list(sqrt=sqrt, log=log, recip=~1/.)) {
  d %>% mutate(across({{col}}, .fns=funs, .names="{.fn}"))
}
tib %>% data_transformations(x)
# A tibble: 10 × 4
x sqrt log recip
<int> <dbl> <dbl> <dbl>
1 1 1 0 1
2 2 1.41 0.693 0.5
3 3 1.73 1.10 0.333
4 4 2 1.39 0.25
5 5 2.24 1.61 0.2
6 6 2.45 1.79 0.167
7 7 2.65 1.95 0.143
8 8 2.83 2.08 0.125
9 9 3 2.20 0.111
10 10 3.16 2.30 0.1
To handle multiple columns:
data_transformations <- function(d, cols, funs=list(sqrt=sqrt, log=log, recip=~1/.)) {
  d %>% mutate(across({{cols}}, .fns=funs))
}
d1 <- tibble(x=1:10, y=seq(2, 20, 2))
d1 %>% data_transformations(c(x, y), list(sqrt=sqrt, log=log))
# A tibble: 10 × 6
x y x_sqrt x_log y_sqrt y_log
<int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 2 1 0 1.41 0.693
2 2 4 1.41 0.693 2 1.39
3 3 6 1.73 1.10 2.45 1.79
4 4 8 2 1.39 2.83 2.08
5 5 10 2.24 1.61 3.16 2.30
6 6 12 2.45 1.79 3.46 2.48
7 7 14 2.65 1.95 3.74 2.64
8 8 16 2.83 2.08 4 2.77
9 9 18 3 2.20 4.24 2.89
10 10 20 3.16 2.30 4.47 3.00
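Since the question passes the column name as a string ('x'), here is a rough sketch of the same idea for character input. The function name data_transformations_chr is made up for illustration; it assumes cols is a character vector of existing column names and selects them with all_of():

```r
library(dplyr)

# Hypothetical variant that takes column names as strings instead of bare names
data_transformations_chr <- function(d, cols,
                                     funs = list(sqrt = sqrt, log = log, recip = ~ 1 / .)) {
  # all_of() selects the columns named in the character vector `cols`
  d %>% mutate(across(all_of(cols), .fns = funs))
}

tib <- tibble(x = 1:10)
out <- tib %>% data_transformations_chr("x")
```

This should give the same x_sqrt, x_log and x_recip columns as the bare-name version, and all_of() raises an error if a requested column does not exist.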