I have data that I would like to summarise by group using the summary_by function (from the doBy package). I can't write the column names directly in the summary_by formula; I have to use variables I created beforehand that hold those names.
Below is the result I would like to achieve:
library(data.table)
library(doBy)
mtcars = data.table(mtcars)
doBy::summary_by(data = mtcars, mpg ~ gear + am, FUN = "mean")
output:
gear am mpg."mean"
3 0 16.10667
4 0 21.05000
4 1 26.27500
5 1 21.38000
Here is what I want to do:
library(data.table)
library(doBy)
mtcars = data.table(mtcars)
variable1 = "gear" # which is a column name of mtcars
variable2 = "am" # which is a column name of mtcars
variable3 = "mpg" # which is a column name of mtcars
doBy::summary_by(data = mtcars, variable3 ~ variable1 + variable2, FUN = "mean")
I tried the functions get, assign, eval and mget, but I couldn't find a solution.
CodePudding user response:
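Build the formula from a string with as.formula() instead of writing a literal formula. A literal formula relies on non-standard evaluation, so variable1, variable2 and variable3 are taken as column names rather than replaced by their values.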
library(data.table)
library(doBy)
mtcars = data.table(mtcars)
variable1 = "gear" # which is a column name of mtcars
variable2 = "am" # which is a column name of mtcars
variable3 = "mpg" # which is a column name of mtcars
doBy::summary_by(data = mtcars,
                 # as an alternative to sprintf(), use paste() or glue()
                 as.formula(sprintf("%s ~ %s + %s", variable3, variable1, variable2)),
                 FUN = "mean")
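To see what the string approach produces, here is a minimal sketch that builds the same formula and inspects it. It uses base R's aggregate() as a stand-in for summary_by() so that it runs without doBy installed; that swap is an assumption made for illustration and is not part of the answer above.

```r
# Column names held in variables, as in the question
variable1 <- "gear"
variable2 <- "am"
variable3 <- "mpg"

# Build the formula text first, then convert it to a formula object
f <- as.formula(sprintf("%s ~ %s + %s", variable3, variable1, variable2))
print(f)            # the formula now contains the real column names
print(all.vars(f))  # the column names the formula refers to

# Base R stand-in for summary_by(): mean of mpg per gear/am combination
res <- aggregate(f, data = mtcars, FUN = mean)
print(res)
```

Because the substitution happens in the string before as.formula() is called, the function receiving the formula never sees variable1, variable2 or variable3.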
CodePudding user response:
Thanks @mnist, it works!
I just found two other ways:
library(data.table)
library(doBy)
mtcars = data.table(mtcars)
variable1 = "gear" # which is a column name of mtcars
variable2 = "am" # which is a column name of mtcars
variable3 = "mpg" # which is a column name of mtcars
summary_by solution with the reformulate function:
summary_by(data = mtcars, reformulate(termlabels = c(variable1, variable2), response = variable3))
data.table native way:
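reformulate() takes the grouping names as a character vector, so the same call should cover any number of grouping columns. A small sketch using only base R; the names group_vars and response_var, and the extra grouping column vs, are made up for this example:

```r
# Hypothetical inputs: three grouping columns instead of two
group_vars <- c("gear", "am", "vs")
response_var <- "mpg"

# reformulate() joins the term labels with "+" and puts the response on the left
f <- reformulate(termlabels = group_vars, response = response_var)
print(f)
```

The resulting object is an ordinary formula, so it can be passed wherever a literal formula would go.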
mtcars[, mean(get(variable3)), by = mget(c(variable1, variable2))]
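A possible variant of the data.table approach, sketched here as an alternative rather than a correction: by/keyby and .SDcols both accept character vectors, so get() and mget() can be avoided. The name dt is introduced only for this example.

```r
library(data.table)

dt <- as.data.table(mtcars)
variable1 <- "gear"
variable2 <- "am"
variable3 <- "mpg"

# keyby takes a character vector of column names and sorts the groups;
# .SDcols restricts .SD to the column being summarised
res <- dt[, lapply(.SD, mean), keyby = c(variable1, variable2), .SDcols = variable3]
print(res)
```

With this form the summarised column keeps its original name (mpg) instead of the default V1.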