How do I change predictors in linear regression in loop in R?

Time:03-06


Below is an example along with the error it produces. Can someone please fix it?

# sample data (the mpg dataset ships with ggplot2)
library(ggplot2)
mpg <- mpg

str(mpg)

# array of predictors
predictors <- c("hwy", "cty")

# loop over predictors
for (predictor in predictors) 
{
  # fit linear regression
  model <- lm(formula = predictor ~ displ + cyl,
              data = mpg)
  
  # summary of model
  summary(model)
}

Error

Error in model.frame.default(formula = predictor ~ displ + cyl, data = mpg,  : 
  variable lengths differ (found for 'displ')

Answer:

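Inside a formula, predictor is read literally as a variable name rather than replaced by its value, so lm finds the length-1 character vector predictor instead of a column of mpg; hence "variable lengths differ". Build the formula from the string with reformulate or paste instead. Also, summary is not printed automatically inside a for loop, so create an object to store its output.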

sumry_model <- vector('list', length(predictors))
names(sumry_model) <- predictors
for (predictor in predictors) {
  # fit linear regression
  model <- lm(reformulate(c("displ", "cyl"), response = predictor),
              data = mpg)
  # with paste
  # model <- lm(formula = paste0(predictor, "~ displ + cyl"), data = mpg)
  
  # summary of model
  sumry_model[[predictor]] <- summary(model)
}

Output:

> sumry_model
$hwy

Call:
lm(formula = reformulate(c("displ", "cyl"), response = predictor), 
    data = mpg)

Residuals:
    Min      1Q  Median      3Q     Max 
-7.5098 -2.1953 -0.2049  1.9023 14.9223 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  38.2162     1.0481  36.461  < 2e-16 ***
displ        -1.9599     0.5194  -3.773 0.000205 ***
cyl          -1.3537     0.4164  -3.251 0.001323 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.759 on 231 degrees of freedom
Multiple R-squared:  0.6049,    Adjusted R-squared:  0.6014 
F-statistic: 176.8 on 2 and 231 DF,  p-value: < 2.2e-16


$cty

Call:
lm(formula = reformulate(c("displ", "cyl"), response = predictor), 
    data = mpg)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.9276 -1.4750 -0.0891  1.0686 13.9261 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  28.2885     0.6876  41.139  < 2e-16 ***
displ        -1.1979     0.3408  -3.515 0.000529 ***
cyl          -1.2347     0.2732  -4.519 9.91e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.466 on 231 degrees of freedom
Multiple R-squared:  0.6671,    Adjusted R-squared:  0.6642 
F-statistic: 231.4 on 2 and 231 DF,  p-value: < 2.2e-16
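
To see why the formula has to be built from the string, it can help to inspect the formulas directly, without fitting anything. The sketch below is only an illustration (the names f_literal, f_built and f_pasted are made up here). It compares the variables in a formula written with predictor on the left-hand side against the variables in formulas built by reformulate and by as.formula(paste(...)).

```r
predictor <- "hwy"

# Written literally, the formula keeps the symbol `predictor`;
# R does not substitute the value "hwy" into it.
f_literal <- predictor ~ displ + cyl
all.vars(f_literal)

# reformulate() builds the formula from character vectors,
# so the response becomes the column name stored in `predictor`.
f_built <- reformulate(c("displ", "cyl"), response = predictor)
all.vars(f_built)

# Pasting a string and converting it with as.formula() is an alternative.
f_pasted <- as.formula(paste(predictor, "~ displ + cyl"))
identical(deparse(f_built), deparse(f_pasted))
```

Neither call needs the data, so this is a cheap way to check a formula before passing it to lm.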

This can also be fitted as a single model with a multivariate response:

summary(lm(cbind(hwy, cty) ~ displ + cyl, data = mpg))

Or, to reuse the predictors vector (which here holds the names of the response columns):

summary(lm(as.matrix(mpg[predictors]) ~ displ + cyl, data = mpg))
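
As a rough check of what the multivariate fit returns, the sketch below compares it with separate per-response fits. It uses the built-in mtcars data as a stand-in so that it runs without ggplot2. The column choices (mpg and qsec as responses, disp and cyl as terms) are assumptions made for illustration only and are not part of the original answer.

```r
# Assumption: mtcars stands in for the mpg data; the columns are illustrative.
responses <- c("mpg", "qsec")

# One model with a matrix response: one coefficient column per response.
multi <- lm(as.matrix(mtcars[responses]) ~ disp + cyl, data = mtcars)
coef(multi)

# Separate fits, one per response, each built with reformulate().
single <- sapply(responses, function(r) {
  coef(lm(reformulate(c("disp", "cyl"), response = r), data = mtcars))
})

# Compare the coefficients from the two approaches.
all.equal(coef(multi), single, check.attributes = FALSE)
```

The matrix-response form fits every response in one call, and calling summary on it gives one summary per response. The loop is still the more flexible choice when the right-hand side has to differ between responses.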