Returning actual function argument from vector in function call with lapply in R

Time:07-23

Please consider the following.

I want to use lapply() to pass several argument values, stored in a character vector, to another function one after the other. A minimal reproducible example is to apply two or more "families" to the glm() function. Please note that the example may be nonsensical as a choice of families and is used for illustration purposes only.

The following is taken from the example in ?glm()

counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
data.frame(treatment, outcome, counts) # showing data

We can now run a GLM with family "gaussian" or "poisson"

glm(counts ~ outcome + treatment, family = "gaussian")
glm(counts ~ outcome + treatment, family = "poisson")

This could also be "automated" by creating a character vector with these family names:

families <- c("poisson", "gaussian")

And using this in an lapply() function.

But when this runs, the call printed for each fitted model no longer shows the family name; it shows the anonymous function's argument x instead.

lapply(families, function(x) glm(counts ~ outcome + treatment, family = x))
#> [[1]]
#> 
#> Call:  glm(formula = counts ~ outcome + treatment, family = x)
#> 
#> Coefficients:
#> (Intercept)     outcome2     outcome3   treatment2   treatment3  
#>   3.045e+00   -4.543e-01   -2.930e-01   -3.242e-16   -2.148e-16  
#> 
#> Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
#> Null Deviance:       10.58 
#> Residual Deviance: 5.129     AIC: 56.76
#> 
#> [[2]]
#> 
#> Call:  glm(formula = counts ~ outcome + treatment, family = x)
#> 
#> Coefficients:
#> (Intercept)     outcome2     outcome3   treatment2   treatment3  
#>   2.100e+01   -7.667e+00   -5.333e+00    2.221e-15    2.971e-15  
#> 
#> Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
#> Null Deviance:       176 
#> Residual Deviance: 83.33     AIC: 57.57

Question: How can the family names from the vector families be preserved/shown in the function call after lapply()?


Desired outcome: The outcome should look like this:

#> [[1]]
#> 
#> Call:  glm(formula = counts ~ outcome + treatment, family = "poisson")
#> 
#> Coefficients:
#> (Intercept)     outcome2     outcome3   treatment2   treatment3  
#>   3.045e+00   -4.543e-01   -2.930e-01   -3.242e-16   -2.148e-16  
#> 
#> Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
#> Null Deviance:       10.58 
#> Residual Deviance: 5.129     AIC: 56.76
#> 
#> [[2]]
#> 
#> Call:  glm(formula = counts ~ outcome + treatment, family = "gaussian")
#> 
#> Coefficients:
#> (Intercept)     outcome2     outcome3   treatment2   treatment3  
#>   2.100e+01   -7.667e+00   -5.333e+00    2.221e-15    2.971e-15  
#> 
#> Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
#> Null Deviance:       176 
#> Residual Deviance: 83.33     AIC: 57.57

I tried eval(bquote(x)) as suggested here: R: Passing named function arguments from vector, but this did not work. See:

lapply(families, function(x) glm(counts ~ outcome + treatment, family = eval(bquote(x))))
#> [[1]]
#> 
#> Call:  glm(formula = counts ~ outcome + treatment, family = eval(bquote(x)))
#> 
#> Coefficients:
#> (Intercept)     outcome2     outcome3   treatment2   treatment3  
#>   3.045e+00   -4.543e-01   -2.930e-01   -3.242e-16   -2.148e-16  
#> 
#> Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
#> Null Deviance:       10.58 
#> Residual Deviance: 5.129     AIC: 56.76
#> 
#> [[2]]
#> 
#> Call:  glm(formula = counts ~ outcome + treatment, family = eval(bquote(x)))
#> 
#> Coefficients:
#> (Intercept)     outcome2     outcome3   treatment2   treatment3  
#>   2.100e+01   -7.667e+00   -5.333e+00    2.221e-15    2.971e-15  
#> 
#> Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
#> Null Deviance:       176 
#> Residual Deviance: 83.33     AIC: 57.57

Created on 2022-07-22 by the reprex package (v2.0.1)

Thank you!

CodePudding user response:

One approach is to extract the family name from each fitted model and write it into the call stored in that model object. For instance:

lapply(families, \(fam) {
  model <- glm(counts ~ outcome + treatment, family = fam)
  model$call$family <- model$family$family  # replace `fam` in the stored call with the family name
  model
})

Output:

[[1]]

Call:  glm(formula = counts ~ outcome + treatment, family = "poisson")

Coefficients:
(Intercept)     outcome2     outcome3   treatment2   treatment3  
  3.045e+00   -4.543e-01   -2.930e-01   -3.242e-16   -2.148e-16  

Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
Null Deviance:      10.58 
Residual Deviance: 5.129    AIC: 56.76

[[2]]

Call:  glm(formula = counts ~ outcome + treatment, family = "gaussian")

Coefficients:
(Intercept)     outcome2     outcome3   treatment2   treatment3  
  2.100e+01   -7.667e+00   -5.333e+00    2.221e-15    2.971e-15  

Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
Null Deviance:      176 
Residual Deviance: 83.33    AIC: 57.57
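Patching the call can matter beyond printing: update() re-evaluates the stored call, so a call that still refers to the local variable fam would typically fail once lapply() has returned. The following is a minimal sketch using the data from the question; the names fits and refit are only illustrative.

```r
# Data from the question (taken from ?glm)
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
families <- c("poisson", "gaussian")

fits <- lapply(families, function(fam) {
  model <- glm(counts ~ outcome + treatment, family = fam)
  # Overwrite the unevaluated `fam` in the stored call with the family name
  model$call$family <- model$family$family
  model
})

# update() rebuilds and evaluates the stored call; this works because the
# call now contains the string "poisson" rather than the local symbol `fam`
refit <- update(fits[[1]], . ~ . - treatment)
refit$call
```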

Depending on the purpose, you could also simply name the elements of the vector; each element of the resulting list then carries that name.

families <- c(poisson = "poisson", gaussian = "gaussian")
lapply(families, function(x) glm(counts ~ outcome + treatment, family = x))

Output:

$poisson

Call:  glm(formula = counts ~ outcome + treatment, family = x)

Coefficients:
(Intercept)     outcome2     outcome3   treatment2   treatment3  
  3.045e+00   -4.543e-01   -2.930e-01   -3.242e-16   -2.148e-16  

Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
Null Deviance:      10.58 
Residual Deviance: 5.129    AIC: 56.76

$gaussian

Call:  glm(formula = counts ~ outcome + treatment, family = x)

Coefficients:
(Intercept)     outcome2     outcome3   treatment2   treatment3  
  2.100e+01   -7.667e+00   -5.333e+00    2.221e-15    2.971e-15  

Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
Null Deviance:      176 
Residual Deviance: 83.33    AIC: 57.57
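With a named list, the models can be looked up or summarised by family name even though the printed call still shows x. A small sketch, assuming the data from the question; setNames() is used here only as a shortcut for naming the vector by its own values, and fits is an illustrative name.

```r
# Data from the question (taken from ?glm)
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)

families <- c("poisson", "gaussian")

# setNames(families, families) is equivalent to
# c(poisson = "poisson", gaussian = "gaussian")
fits <- lapply(setNames(families, families),
               function(x) glm(counts ~ outcome + treatment, family = x))

names(fits)                   # list names follow the vector names
fits$gaussian$family$family   # the family actually used by that model
sapply(fits, AIC)             # named vector, one AIC per family
```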

Updated to add approach 1.

CodePudding user response:

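A more direct way to do this would be to build and evaluate the call directly inside lapply(). This assumes the data frame from the question has been assigned to df:

df <- data.frame(treatment, outcome, counts)

lapply(families, function(x) {
  eval(as.call(list(quote(glm), 
               formula = counts ~ outcome + treatment, 
               data = quote(df), 
               family = x)))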
  })
#> [[1]]
#> 
#> Call:  glm(formula = counts ~ outcome + treatment, family = "poisson", 
#>     data = df)
#> 
#> Coefficients:
#> (Intercept)     outcome2     outcome3   treatment2   treatment3  
#>   3.045e+00   -4.543e-01   -2.930e-01    1.338e-15    1.421e-15  
#> 
#> Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
#> Null Deviance:       10.58 
#> Residual Deviance: 5.129     AIC: 56.76
#> 
#> [[2]]
#> 
#> Call:  glm(formula = counts ~ outcome + treatment, family = "gaussian", 
#>     data = df)
#> 
#> Coefficients:
#> (Intercept)     outcome2     outcome3   treatment2   treatment3  
#>   2.100e+01   -7.667e+00   -5.333e+00    2.056e-16    7.252e-16  
#> 
#> Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
#> Null Deviance:       176 
#> Residual Deviance: 83.33     AIC: 57.57

Created on 2022-07-22 by the reprex package (v2.0.1)
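The same idea can be written with bquote(), which also shows why the attempt in the question had no effect: bquote(x) without a .() wrapper just returns the symbol x, whereas .(x) splices the current value of x into the expression before glm() ever sees it. A minimal sketch, assuming df holds the data frame from the question; fits is an illustrative name.

```r
# Data from the question (taken from ?glm)
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
df <- data.frame(treatment, outcome, counts)

families <- c("poisson", "gaussian")

fits <- lapply(families, function(x) {
  # .(x) is replaced by the value of x ("poisson", then "gaussian"),
  # so the call that glm() records contains the string, not the symbol
  eval(bquote(glm(counts ~ outcome + treatment, family = .(x), data = df)))
})

fits[[1]]$call$family   # "poisson"
fits[[2]]$call$family   # "gaussian"
```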

CodePudding user response:

Yet another possible solution, again assuming df holds the data frame from the question:

families <- c("gaussian", "poisson")
lapply(families, \(x) eval(parse(text = paste0(
  "glm(counts ~ outcome + treatment, data = df, family = ", x, ")"))))

#> [[1]]
#> 
#> Call:  glm(formula = counts ~ outcome + treatment, family = gaussian, 
#>     data = df)
#> 
#> Coefficients:
#> (Intercept)     outcome2     outcome3   treatment2   treatment3  
#>   2.100e+01   -7.667e+00   -5.333e+00    8.729e-16    7.252e-16  
#> 
#> Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
#> Null Deviance:       176 
#> Residual Deviance: 83.33     AIC: 57.57
#> 
#> [[2]]
#> 
#> Call:  glm(formula = counts ~ outcome + treatment, family = poisson, 
#>     data = df)
#> 
#> Coefficients:
#> (Intercept)     outcome2     outcome3   treatment2   treatment3  
#>   3.045e+00   -4.543e-01   -2.930e-01    1.011e-15    7.105e-16  
#> 
#> Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
#> Null Deviance:       10.58 
#> Residual Deviance: 5.129     AIC: 56.76
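If the unquoted form family = gaussian is wanted without pasting and parsing a string, one option is to convert the family name to a symbol with as.name() and place it in a call built with as.call(). This is a sketch under the same assumption that df holds the data frame from the question; fits is an illustrative name.

```r
# Data from the question (taken from ?glm)
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
df <- data.frame(treatment, outcome, counts)

families <- c("gaussian", "poisson")

fits <- lapply(families, function(x) {
  cl <- as.call(list(quote(glm),
                     formula = counts ~ outcome + treatment,
                     family = as.name(x),   # symbol `gaussian` / `poisson`
                     data = quote(df)))
  # Evaluating the call looks up the symbol and finds the family function
  eval(cl)
})

fits[[1]]$call$family   # the symbol gaussian, printed without quotes
```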