I'm trying to predict from a fitted model, but I get an error:
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
variable lengths differ (found for 'Welfare.Measurment')
The test and training data have the same variable names and structure. I even tried to rbind
the two data frames, but the error persists.
Here is the code:
model3 <- lm(log(Poverty.Line.Day) ~ (log(data_abs$Median)) +
Welfare.Measurment + Control, data=data_abs)
predicted_poverty_Line <-
exp(predict(model3, dataF))*exp((summary(model3)$sigma)^2/2)
Answer:
In lm, do not use $ in the formula when you supply the data= argument.
fit1 <- lm(y ~ train$X1 + X2, data=train) ## predict will fail
predict(fit1, newdata=test)
# Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
#   variable lengths differ (found for 'X2')
fit2 <- lm(y ~ X1 + X2, data=train) ## predict will work
predict(fit2, newdata=test)
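Reason: If you use e.g. train$X1 in the formula, that variable is fixed to the values in train, so even if you provide newdata= in predict, the old values are used. Unless that vector happens to have the same length as the columns of newdata, you get this error. If the lengths do happen to match, no error is raised and the predictions silently use the old values.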
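To make the reason concrete, here is a small sketch of the silent case, where the new data happens to have as many rows as the training data. The data frames new_same and new_shift are made up for illustration and are not part of the example data in this answer.

```r
set.seed(1)
train <- data.frame(X1 = rnorm(20), X2 = rnorm(20))
train$y <- 1 + train$X1 + rnorm(20)

# Hypothetical new data with the SAME number of rows as train
new_same  <- data.frame(X1 = rnorm(20), X2 = rnorm(20))
# Same new data, but with X1 shifted by a large amount
new_shift <- transform(new_same, X1 = X1 + 100)

bad  <- lm(y ~ train$X1 + X2, data = train)  # $ inside the formula
good <- lm(y ~ X1 + X2, data = train)        # plain column names

# The term label keeps the literal expression "train$X1", so predict()
# re-evaluates train$X1 instead of looking up X1 in newdata.
attr(terms(bad), "term.labels")

p_bad_1  <- predict(bad,  newdata = new_same)
p_bad_2  <- predict(bad,  newdata = new_shift)
p_good_1 <- predict(good, newdata = new_same)
p_good_2 <- predict(good, newdata = new_shift)

# TRUE: shifting X1 in newdata changes nothing, because X1 is ignored
isTRUE(all.equal(p_bad_1, p_bad_2))
# FALSE: the model with plain column names reacts to the new X1
isTRUE(all.equal(p_good_1, p_good_2))
```

So the length error is the lucky case. With equal lengths the same mistake goes unnoticed.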
Data:
n <- 60
set.seed(42)
dat <- data.frame(X1=rnorm(n), X2=rnorm(n))
dat <- transform(dat, y=1 + X1 + rnorm(n))
train <- dat[1:20, ]
test <- dat[21:n, ]
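Applied to the model in the question, the fix would presumably be to write log(Median) instead of log(data_abs$Median). The sketch below is only an illustration: the real data_abs and dataF are not shown in the question, so it uses simulated stand-ins with the same column names, and the factor levels "A" and "B" are invented.

```r
set.seed(42)
n <- 60

# Simulated stand-in for the asker's data (assumption: the real column
# types and values are unknown; only the names come from the question).
sim <- data.frame(
  Median             = exp(rnorm(n, mean = 1)),
  Welfare.Measurment = factor(sample(c("A", "B"), n, replace = TRUE)),
  Control            = rnorm(n)
)
sim$Poverty.Line.Day <- exp(0.5 + 0.8 * log(sim$Median) +
                            0.1 * sim$Control + rnorm(n, sd = 0.2))

data_abs <- sim[1:40, ]    # plays the role of the training data
dataF    <- sim[41:n, ]    # plays the role of the new data

# No $ in the formula: every variable is looked up in data= / newdata=
model3 <- lm(log(Poverty.Line.Day) ~ log(Median) + Welfare.Measurment + Control,
             data = data_abs)

# Back-transform to the original scale, using the same correction
# factor exp(sigma^2 / 2) as in the question
predicted_poverty_Line <-
  exp(predict(model3, newdata = dataF)) * exp(summary(model3)$sigma^2 / 2)

# One prediction per row of the new data
length(predicted_poverty_Line) == nrow(dataF)
```

Here predict returns one value per row of dataF even though dataF and data_abs have different numbers of rows.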