Why am I getting the error "variable lengths differ" in predict.lm?

Time:04-20

I'm trying to generate predictions from a fitted model, but predict() throws an error:

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : 
  variable lengths differ (found for 'Welfare.Measurment')

The test and training data have the same variable names and structure. I even tried combining the two data frames with rbind, but the error persists.

Here is the code:

model3 <- lm(log(Poverty.Line.Day) ~ (log(data_abs$Median)) + 
              Welfare.Measurment + Control, data=data_abs)

predicted_poverty_Line <- 
  exp(predict(model3, dataF))*exp((summary(model3)$sigma)^2/2)

Answer:

In lm, do not use $ inside the formula when you also supply the data= argument.

fit1 <- lm(y ~ train$X1 + X2, data=train)  ## predict will fail
predict(fit1, newdata=test)
# Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : 
#   variable lengths differ (found for 'X2')

fit2 <- lm(y ~ X1 + X2, data=train)  ## predict will work
predict(fit2, newdata=test)
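
Applied to the model from the question, the fix might look like the sketch below. The real data_abs and dataF are not shown in the question, so both are replaced here by made-up data (the column values, the factor levels "Income"/"Consumption", and the 50/10 split are all assumptions for illustration only). The only change to the model itself is log(Median) instead of log(data_abs$Median).

```r
set.seed(123)

# Made-up stand-in for the asker's data (the real data is not shown)
full <- data.frame(
  Median             = exp(rnorm(60, mean = 2)),
  Welfare.Measurment = factor(sample(c("Income", "Consumption"), 60, replace = TRUE)),
  Control            = rnorm(60)
)
full$Poverty.Line.Day <- exp(0.5 + 0.8 * log(full$Median) + rnorm(60, sd = 0.2))

data_abs <- full[1:50, ]   # hypothetical training rows
dataF    <- full[51:60, ]  # hypothetical test rows (different row count)

# No data_abs$ inside the formula: every variable is looked up via data=
model3 <- lm(log(Poverty.Line.Day) ~ log(Median) + Welfare.Measurment + Control,
             data = data_abs)

# Same back-transformation as in the question
predicted_poverty_Line <-
  exp(predict(model3, newdata = dataF)) * exp(summary(model3)$sigma^2 / 2)

length(predicted_poverty_Line)  # one prediction per row of dataF
```

With the formula written this way, predict looks up Median, Welfare.Measurment and Control in dataF, so the number of rows in the test set no longer matters.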

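Reason: If you write e.g. train$X1 in the formula, that term always refers to the column of the train data frame, so predict keeps using the old values even when you supply newdata=. The remaining variables are taken from newdata, so unless newdata happens to have the same number of rows as train, the lengths no longer match and you get this error. If the row counts do happen to match, there is no error, but the predictions are silently based on the old values of that variable.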
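
A small sketch of that behaviour, using made-up data (new_a, new_b and new_c are hypothetical data frames invented for this illustration): when newdata has the same number of rows as train, the $ version raises no error but ignores the new X1 values completely.

```r
set.seed(1)
train <- data.frame(X1 = rnorm(20), X2 = rnorm(20))
train$y <- 1 + train$X1 + rnorm(20)

bad  <- lm(y ~ train$X1 + X2, data = train)  # X1 is hard-wired to train
good <- lm(y ~ X1 + X2, data = train)

# Two hypothetical new data sets with the same row count as train,
# differing only in X1
new_a <- data.frame(X1 = rnorm(20), X2 = rnorm(20))
new_b <- transform(new_a, X1 = X1 + 100)

# bad: changing X1 in newdata has no effect on the predictions
identical(predict(bad, new_a), predict(bad, new_b))    # TRUE
# good: predictions respond to the new X1 values
identical(predict(good, new_a), predict(good, new_b))  # FALSE

# A different row count triggers the error from the question.
# suppressWarnings() hides any accompanying row-count warning so that
# only the error message is captured.
new_c <- new_a[1:5, ]
msg <- tryCatch(suppressWarnings(predict(bad, new_c)),
                error = function(e) conditionMessage(e))
msg
```

So the length error is actually the lucky case: it at least tells you something is wrong, whereas equal row counts give plausible-looking but wrong predictions.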


Data:

n <- 60
set.seed(42)
dat <- data.frame(X1=rnorm(n), X2=rnorm(n))
dat <- transform(dat, y=1 + X1 + rnorm(n))
train <- dat[1:20, ]
test <- dat[21:n, ]