I am trying to iterate over multiple columns for a glm function in R.
View(mtcars)
names <- names(mtcars[-c(1,2)])
for(i in 1:length(names)){
print(paste0("Starting iterations for ",names[i]))
model <- glm(mpg ~ cyl + paste0(names[i]), data=mtcars, family = gaussian())
summary(model)
print(paste0("Iterations for ",names[i], " finished"))
}
However, I am getting the following error:
[1] "Starting iterations for disp"
Error in model.frame.default(formula = mpg ~ cyl + paste0(names[i]), data = mtcars, :
variable lengths differ (found for 'paste0(names[i])')
I am not sure how to correct this.
CodePudding user response:
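mpg ~ cyl + paste0(names[i])
or even mpg ~ cyl + names[i]
is not a valid way to build a formula. R treats paste0(names[i]) as a model term, and it evaluates to a single string rather than a column of mtcars, which is why you get the "variable lengths differ" error. Use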
reformulate(c("cyl", names[i]), "mpg")
instead, which dynamically creates a formula from variable names.
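To make this concrete, here is a minimal sketch of what reformulate returns for one predictor. The choice of "disp" is just for illustration; any other column name of mtcars would work the same way.

```r
# reformulate() builds a formula object from character vectors:
# the first argument holds the right-hand-side terms, `response` the left-hand side.
fml <- reformulate(c("cyl", "disp"), response = "mpg")
print(fml)

# The resulting object is an ordinary formula, so glm() accepts it directly.
model <- glm(fml, data = mtcars, family = gaussian())
print(names(coef(model)))
```

The printed formula should read mpg ~ cyl + disp, which is the same thing you would have typed by hand.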
CodePudding user response:
Since you need to build your model formula dynamically from strings, you need as.formula. Alternatively, consider reformulate, which takes the RHS variable names and the response name:
...
fml <- reformulate(c("cyl", names[i]), "mpg")
model <- glm(fml, data=mtcars, family = gaussian())
print(summary(model))  # auto-printing is off inside a for loop
...
CodePudding user response:
glm takes a formula, which you can create from a string using as.formula():
predictors <- names(mtcars[-c(1,2)])
for(predictor in predictors){
print(paste0("Starting iterations for ",predictor))
model <- glm(as.formula(paste0("mpg ~ cyl + ",predictor)),
data=mtcars,
family = gaussian())
print(summary(model))
print(paste0("Iterations for ",predictor, " finished"))
}
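Note that the loop overwrites model on every pass, so only the last fit survives. If you want to keep all of them, one possible variation (not part of the original question, and the names models and extra_coefs are my own) is to collect the fits in a named list with lapply:

```r
# Every column except the response (mpg) and the fixed covariate (cyl).
predictors <- setdiff(names(mtcars), c("mpg", "cyl"))

# Fit one model per predictor and keep each fit instead of overwriting it.
models <- lapply(predictors, function(p) {
  glm(reformulate(c("cyl", p), response = "mpg"),
      data = mtcars, family = gaussian())
})
names(models) <- predictors

# Pull out the coefficient of the varying predictor from each fit.
extra_coefs <- sapply(predictors, function(p) coef(models[[p]])[[p]])
print(extra_coefs)
```

Afterwards you can inspect any single fit with, for example, summary(models[["wt"]]).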