Iterating and looping over multiple columns in glm in R using a name from another variable

Time:07-17

I am trying to iterate over multiple columns for a glm function in R.

View(mtcars)
names <- names(mtcars[-c(1,2)])

for(i in 1:length(names)){
  
  print(paste0("Starting iterations for ",names[i]))
  
  
  model <-  glm(mpg ~ cyl + paste0(names[i]), data=mtcars, family = gaussian())
  summary(model)
  
  print(paste0("Iterations for ",names[i], " finished"))
}

However, I am getting the following error:

[1] "Starting iterations for disp"
Error in model.frame.default(formula = mpg ~ cyl + paste0(names[i]), data = mtcars,  : 
  variable lengths differ (found for 'paste0(names[i])')

I am not sure how to correct this.

CodePudding user response:

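mpg ~ cyl + paste0(names[i]), or even mpg ~ cyl + names[i], does not work as a model formula: the last term is evaluated as a length-one character vector rather than looked up as a column of mtcars, which is why the error reports differing variable lengths. Use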

reformulate(c("cyl", names[i]), "mpg")

instead, which dynamically creates a formula from variable names.
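As a quick check of what reformulate returns, here is a minimal sketch for a single predictor. The column "disp" is picked only for illustration; any other column name of mtcars would work the same way.

```r
# Build the formula for one predictor from plain strings
fml <- reformulate(c("cyl", "disp"), response = "mpg")

print(fml)         # shows the formula mpg ~ cyl + disp
print(class(fml))  # "formula", so it can be passed straight to glm()
```

Inside the loop, replacing "disp" with names[i] gives one formula per column.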

CodePudding user response:

Since you need to build your model formula dynamically from a string, you need as.formula. Alternatively, consider reformulate, which takes the right-hand-side variable names and the response name:

...
    fml <- reformulate(c("cyl", names[i]), "mpg")
    model <-  glm(fml, data=mtcars, family = gaussian())
    print(summary(model))  # auto-printing is off inside a for loop, so print explicitly
...
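If you want to keep the fitted models rather than only print them, one possible variation is to collect them in a named list. This is a sketch, not part of the original question: the names predictors and models are made up for illustration, and lapply stands in for the for loop.

```r
# Hypothetical names: `predictors` and `models` are illustrative only
predictors <- names(mtcars)[-c(1, 2)]  # every column except mpg and cyl

# Fit one model per predictor and keep all of them
models <- lapply(predictors, function(p) {
  fml <- reformulate(c("cyl", p), response = "mpg")  # e.g. mpg ~ cyl + disp
  glm(fml, data = mtcars, family = gaussian())
})
names(models) <- predictors

# Inspect any single fit afterwards
print(summary(models[["disp"]]))
```

Storing the fits this way makes it possible to compare coefficients or AIC values across predictors after the loop has finished.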

CodePudding user response:

glm takes a formula, which you can create from a string using as.formula():

predictors <- names(mtcars[-c(1,2)])

for(predictor in predictors){
  
  print(paste0("Starting iterations for ",predictor))
  
  model <-  glm(as.formula(paste0("mpg ~ cyl + ", predictor)), 
                           data=mtcars, 
                           family = gaussian())
  print(summary(model))
  
  print(paste0("Iterations for ",predictor, " finished"))
}
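For what it is worth, the as.formula approach and reformulate should lead to the same model. The sketch below compares the two for one predictor; "wt" is an arbitrary choice, and the variable names f_paste, f_reform, fit_paste and fit_reform are made up for this example.

```r
predictor <- "wt"  # any column name other than mpg and cyl

# Two ways to build the same formula from a string
f_paste  <- as.formula(paste0("mpg ~ cyl + ", predictor))
f_reform <- reformulate(c("cyl", predictor), response = "mpg")

# Both describe the model mpg ~ cyl + wt
print(f_paste)
print(f_reform)

# Fit with each formula and compare the estimated coefficients
fit_paste  <- glm(f_paste,  data = mtcars, family = gaussian())
fit_reform <- glm(f_reform, data = mtcars, family = gaussian())
print(all.equal(coef(fit_paste), coef(fit_reform)))
```

Which one to use is mostly a matter of taste: pasting a string is flexible, while reformulate avoids assembling the "+" separators by hand.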