Please consider the following.
I want to use lapply() to successively apply several function arguments stored in a character vector to some other function. A minimal reproducible example is applying two or more "families" to the glm() function. Please note that fitting these families to this data may be nonsensical; the example is for illustration purposes only.
The following data are taken from the example in ?glm:
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
data.frame(treatment, outcome, counts) # showing data
We can now run a GLM with family "gaussian" or "poisson":
glm(counts ~ outcome + treatment, family = "gaussian")
glm(counts ~ outcome + treatment, family = "poisson")
This could also be "automated" by creating a character vector with these family names:
families <- c("poisson", "gaussian")
And using this in an lapply() call.
But once this runs, the printed call no longer shows the family names, only the anonymous function's argument x:
lapply(families, function(x) glm(counts ~ outcome + treatment, family = x))
#> [[1]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = x)
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 3.045e+00 -4.543e-01 -2.930e-01 -3.242e-16 -2.148e-16
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 10.58
#> Residual Deviance: 5.129 AIC: 56.76
#>
#> [[2]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = x)
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 2.100e+01 -7.667e+00 -5.333e+00 2.221e-15 2.971e-15
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 176
#> Residual Deviance: 83.33 AIC: 57.57
Question:
How can the family names from the vector families be preserved/shown in the function call after lapply()?
Desired outcome: The outcome should look like this:
#> [[1]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = "poisson")
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 3.045e+00 -4.543e-01 -2.930e-01 -3.242e-16 -2.148e-16
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 10.58
#> Residual Deviance: 5.129 AIC: 56.76
#>
#> [[2]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = "gaussian")
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 2.100e+01 -7.667e+00 -5.333e+00 2.221e-15 2.971e-15
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 176
#> Residual Deviance: 83.33 AIC: 57.57
I tried eval(bquote(x)) as suggested here: R: Passing named function arguments from vector, but this did not work. See:
lapply(families, function(x) glm(counts ~ outcome + treatment, family = eval(bquote(x))))
#> [[1]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = eval(bquote(x)))
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 3.045e+00 -4.543e-01 -2.930e-01 -3.242e-16 -2.148e-16
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 10.58
#> Residual Deviance: 5.129 AIC: 56.76
#>
#> [[2]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = eval(bquote(x)))
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 2.100e+01 -7.667e+00 -5.333e+00 2.221e-15 2.971e-15
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 176
#> Residual Deviance: 83.33 AIC: 57.57
Created on 2022-07-22 by the reprex package (v2.0.1)
Thank you!
CodePudding user response:
One approach is to extract the family name from each fitted model and write it into the model's stored call. For instance:
lapply(families, \(fam) {
  model <- glm(counts ~ outcome + treatment, family = fam)
  model$call$family <- model$family$family  # replace `fam` in the stored call with the family name
  model
})
Output:
[[1]]
Call: glm(formula = counts ~ outcome + treatment, family = "poisson")
Coefficients:
(Intercept) outcome2 outcome3 treatment2 treatment3
3.045e+00 -4.543e-01 -2.930e-01 -3.242e-16 -2.148e-16
Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
Null Deviance: 10.58
Residual Deviance: 5.129 AIC: 56.76
[[2]]
Call: glm(formula = counts ~ outcome + treatment, family = "gaussian")
Coefficients:
(Intercept) outcome2 outcome3 treatment2 treatment3
2.100e+01 -7.667e+00 -5.333e+00 2.221e-15 2.971e-15
Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
Null Deviance: 176
Residual Deviance: 83.33 AIC: 57.57
Depending on the purpose, you could also just name the elements of your vector, so that each list element carries the name of its family:
families <- c(poisson = "poisson", gaussian = "gaussian")
lapply(families, function(x) glm(counts ~ outcome + treatment, family = x))
Output:
$poisson
Call: glm(formula = counts ~ outcome + treatment, family = x)
Coefficients:
(Intercept) outcome2 outcome3 treatment2 treatment3
3.045e+00 -4.543e-01 -2.930e-01 -3.242e-16 -2.148e-16
Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
Null Deviance: 10.58
Residual Deviance: 5.129 AIC: 56.76
$gaussian
Call: glm(formula = counts ~ outcome + treatment, family = x)
Coefficients:
(Intercept) outcome2 outcome3 treatment2 treatment3
2.100e+01 -7.667e+00 -5.333e+00 2.221e-15 2.971e-15
Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
Null Deviance: 176
Residual Deviance: 83.33 AIC: 57.57
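With a named list, the family of each fit can be looked up by name instead of being read off the printed call. A minimal sketch, assuming the data from the question; the name fits is only illustrative:

```r
# Data from the question (taken from the ?glm example)
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)

families <- c(poisson = "poisson", gaussian = "gaussian")
# `fits` is an illustrative name; lapply() carries the names of `families` over
fits <- lapply(families, function(x) glm(counts ~ outcome + treatment, family = x))

names(fits)                 # list names come from the names of `families`
fits$poisson$family$family  # family actually used by that fit

# Family of every fit as a named character vector
vapply(fits, function(m) m$family$family, character(1))
```

The printed call still shows family = x; only the list names and the stored family object identify the model.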
Update with approach 1.
CodePudding user response:
A more direct way is to build and evaluate the call inside lapply(), so that the value of x rather than the symbol x ends up in the call (this assumes the data are stored in a data frame df):
lapply(families, function(x) {
  eval(as.call(list(quote(glm),
                    formula = counts ~ outcome + treatment,
                    data = quote(df),
                    family = x)))
})
#> [[1]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = "poisson",
#> data = df)
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 3.045e+00 -4.543e-01 -2.930e-01 1.338e-15 1.421e-15
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 10.58
#> Residual Deviance: 5.129 AIC: 56.76
#>
#> [[2]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = "gaussian",
#> data = df)
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 2.100e+01 -7.667e+00 -5.333e+00 2.056e-16 7.252e-16
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 176
#> Residual Deviance: 83.33 AIC: 57.57
Created on 2022-07-22 by the reprex package (v2.0.1)
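A call with the family name filled in can also be assembled with bquote(). This may additionally explain why eval(bquote(x)) in the question had no effect: there, glm() still records the unevaluated argument expression, whereas in the sketch below .(x) is replaced by its value before glm() is ever called. It assumes the data are stored in a data frame named df; fits is an illustrative name:

```r
# Data from the question (taken from the ?glm example)
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
df <- data.frame(treatment, outcome, counts)  # assumed name for the data frame

families <- c("poisson", "gaussian")
fits <- lapply(families, function(x) {
  # .(x) is substituted by the string held in x; the rest of the call stays unevaluated
  eval(bquote(glm(counts ~ outcome + treatment, family = .(x), data = df)))
})

fits[[1]]$call  # the stored call now contains the family name instead of x
```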
CodePudding user response:
Yet another possible solution:
families <- c("gaussian", "poisson")
lapply(families, \(x) eval(parse(text = paste0(
  "glm(counts ~ outcome + treatment, data = df, family = ", x, ")"))))
#> [[1]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = gaussian,
#> data = df)
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 2.100e+01 -7.667e+00 -5.333e+00 8.729e-16 7.252e-16
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 176
#> Residual Deviance: 83.33 AIC: 57.57
#>
#> [[2]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = poisson,
#> data = df)
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 3.045e+00 -4.543e-01 -2.930e-01 1.011e-15 7.105e-16
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 10.58
#> Residual Deviance: 5.129 AIC: 56.76
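If pasting code together as a string feels fragile, an unquoted family = gaussian in the call can probably be obtained without parse() by converting each name to a symbol with as.name() and splicing it into the call with bquote(). This is a swapped-in variant, not the method shown above; it assumes the data are stored in a data frame named df, and fits is an illustrative name:

```r
# Data from the question (taken from the ?glm example)
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
df <- data.frame(treatment, outcome, counts)  # assumed name for the data frame

families <- c("gaussian", "poisson")
fits <- lapply(families, function(x) {
  # as.name(x) turns the string into a symbol, so the call reads family = gaussian
  eval(bquote(glm(counts ~ outcome + treatment, family = .(as.name(x)), data = df)))
})

fits[[1]]$call  # family appears as a bare name, as with eval(parse())
```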