I'm having issues with dplyr::mutate(). It works fine when I call it directly, but I get an error as soon as I try to define a function that uses it.
My goal is to create a function that lets you calculate the quartiles of a distribution for a whole data set, as well as for grouped subsets of that data set.
For example:
This works fine
dat <- mtcars
test_table <- dat %>%
  bind_rows(mutate(., cyl = "all")) %>%
  group_by(cyl) %>%
  summarise(mpg_q25 = quantile(mpg, prob = .25),
            mpg_q50 = quantile(mpg, prob = .50),
            mpg_q75 = quantile(mpg, prob = .75),
            count = n())
test_table
Output
# A tibble: 4 × 5
  cyl   mpg_q25 mpg_q50 mpg_q75 count
  <chr>   <dbl>   <dbl>   <dbl> <int>
1 4        22.8    26      30.4    11
2 6        18.6    19.7    21       7
3 8        14.4    15.2    16.2    14
4 all      15.4    19.2    22.8    32
However, this does not
mpg_table <- function(df, grouping_var, val) {
bind_rows(mutate(., {{grouping_var}} = "all")) %>%
group_by({{grouping_var}}) %>%
summarise(mpgq25 = quantile({{mpg}}, prob = .25),
mpgq50 = quantile({{mpg}}, prob = .50),
mpgq75 = quantile({{mpg}}, prob = .75),
count = n())
}
mpg_table(dat, cyl, mpg)
Output from trying to define the function
Error: unexpected '=' in:
"mpg_table <- function(df, grouping_var, val) {
bind_rows(mutate(., {{grouping_var}} ="
Anyone have any idea what's going wrong here? Thank you!
Answer:
How about this:
library(dplyr)
library(glue)
data(mtcars)
dat <- mtcars
mpg_table <- function(df, grouping_var, val) {
  df %>%
    # Convert the grouping column to character so it can hold the "all" label
    mutate({{grouping_var}} := as.character({{grouping_var}})) %>%
    # Append a copy of the data in which every row belongs to the "all" group
    bind_rows(mutate(., {{grouping_var}} := "all")) %>%
    group_by({{grouping_var}}) %>%
    # Output column names are built from the column passed as val
    summarise("{{val}}q25" := quantile({{val}}, probs = .25),
              "{{val}}q50" := quantile({{val}}, probs = .50),
              "{{val}}q75" := quantile({{val}}, probs = .75),
              count = n())
}
mpg_table(dat, cyl, mpg)
#> # A tibble: 4 × 5
#>   cyl   mpgq25 mpgq50 mpgq75 count
#>   <chr>  <dbl>  <dbl>  <dbl> <int>
#> 1 4       22.8   26     30.4    11
#> 2 6       18.6   19.7   21       7
#> 3 8       14.4   15.2   16.2    14
#> 4 all     15.4   19.2   22.8    32
Created on 2022-09-29 by the reprex package (v2.0.1)
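The error comes from R's parser: an argument name on the left of = has to be a plain name or a string, so {{grouping_var}} = "all" is a syntax error before dplyr ever sees it. The := operator parses fine and lets you pass a variable in as the name of the new variable to be created. I used the same construct for the names of the quantile columns, and I replaced {{mpg}} with {{val}} so the function uses its own argument. This means that if you pass drat as val, for example, you get dratq25, dratq50 and dratq75 as the variables in the output.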
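To see both uses of := in isolation, here is a minimal sketch. The helpers relabel() and col_mean() are hypothetical names made up for this illustration; they are not part of dplyr or of the function above.

```r
library(dplyr)

# Hypothetical helper: overwrite the column passed as `col` with a constant.
# `{{ col }} :=` works where `{{ col }} =` would be a parse error.
relabel <- function(df, col) {
  df %>%
    mutate({{ col }} := "all")
}

# Hypothetical helper: build the output column name from the `val` argument.
col_mean <- function(df, val) {
  df %>%
    summarise("{{val}}_mean" := mean({{ val }}))
}

relabelled <- relabel(mtcars, cyl)
unique(relabelled$cyl)   # "all"

res <- col_mean(mtcars, drat)
names(res)               # "drat_mean"
```

The string on the left of := is treated as a glue-style template, which is why the column name follows whatever column you pass in.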
The other problem you run into is a type mismatch. The cyl variable is numeric, and you're trying to bind it to a data frame whose cyl variable is character. The first step in the code above converts the grouping_var column to character to avoid this problem.
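The mismatch can be reproduced on two tiny data frames. This is only a sketch: num_df and chr_df are made-up names, and it assumes dplyr 1.0 or later, where bind_rows() refuses to combine a numeric column with a character one.

```r
library(dplyr)

num_df <- data.frame(cyl = c(4, 6))   # numeric, like mtcars$cyl
chr_df <- data.frame(cyl = "all")     # character, like the "all" rows

# Assumption: on dplyr >= 1.0 this bind fails, so the error branch runs.
attempt <- tryCatch(
  bind_rows(num_df, chr_df),
  error = function(e) "bind failed: incompatible column types"
)

# Converting the numeric column first makes the two frames compatible.
fixed <- bind_rows(mutate(num_df, cyl = as.character(cyl)), chr_df)
fixed$cyl   # "4" "6" "all"
```

Converting before binding, rather than after, is what lets the "all" rows sit in the same column as the original group values.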