R Error: Error in glmnet(x, y, weights = weights, offset = offset, lambda = lambda,

Time:10-14

I'm fitting a lasso model and keep getting an error. Please help me solve it:

library(plotmo)
x=model.matrix(Salary~.-1,data=Hitters) 
y=Hitters$Salary
cv.lasso=cv.glmnet(x,y)
plot(cv.lasso, label=5)
coef(cv.lasso)
lasso.tr=glmnet(x[train,],y[train])
pred=predict(lasso.tr,x[-train,])
dim(pred)
rmse= sqrt(apply((y[-train]-pred)^2,2,mean))
plot(log(lasso.tr$lambda),rmse,type="b",xlab="Log(lambda)")
lam.best=lasso.tr$lambda[order(rmse)[1]]
lam.best
coef(lasso.tr,s=lam.best)

Error is:

Error in glmnet(x, y, weights = weights, offset = offset, lambda = lambda, : number of observations in y (322) not equal to the number of rows of x (263)

CodePudding user response:

This error occurs because y has more values than x has rows. Hitters has missing values for Salary, and model.matrix drops those observations, so x ends up with 263 rows while y still holds all 322 values. Remove the observations with a missing Salary before building x and y:

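library(ISLR)    # provides the Hitters data
library(glmnet)  # provides glmnet() and cv.glmnet()
library(plotmo)
# Drop players with a missing Salary so that x and y have the same number of rows
Hitters = Hitters[!is.na(Hitters$Salary), ]
x=model.matrix(Salary~.-1,data=Hitters) 
y=Hitters$Salary
cv.lasso=cv.glmnet(x,y)
plot(cv.lasso, label=5)
coef(cv.lasso)
# train is not defined in the question; a random 180-row training index is assumed here
set.seed(1)
train=sample(nrow(x),180)
lasso.tr=glmnet(x[train,],y[train])
pred=predict(lasso.tr,x[-train,])
dim(pred)
# Test-set RMSE for each lambda on the path
rmse= sqrt(apply((y[-train]-pred)^2,2,mean))
plot(log(lasso.tr$lambda),rmse,type="b",xlab="Log(lambda)")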
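
A quick check before fitting can catch this kind of mismatch early. The sketch below uses a small made-up data frame (toy is hypothetical, not the real Hitters data) so that it runs with base R only. It counts the missing values per column, filters on the response, and stops if x and y disagree in size.

```r
# Hypothetical toy data standing in for Hitters: two rows have a missing response
toy <- data.frame(
  Salary = c(100, NA, 250, NA, 400),
  Hits   = c(10, 20, 30, 40, 50)
)

# Count missing values per column to see where the NAs are
na_counts <- colSums(is.na(toy))
print(na_counts)

# Keep only rows with an observed response, then build x and y from the same rows
toy_clean <- toy[!is.na(toy$Salary), ]
x <- model.matrix(Salary ~ . - 1, data = toy_clean)
y <- toy_clean$Salary

# Fail early if the number of rows in x and the length of y disagree
stopifnot(nrow(x) == length(y))
```

Building both x and y from the same filtered data frame is what keeps them aligned; the stopifnot line is only a guard.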

CodePudding user response:

First off, please always make sure your post is self-contained by explicitly stating any non-base R packages. This is particularly important when you refer to an external package for sample data (ISLR::Hitters).

The issue you're seeing has to do with observations containing missing values in ISLR::Hitters and how they are treated (i.e., omitted!) within model.matrix.
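
To see the omission in isolation, here is a minimal sketch on a made-up data frame (toy is hypothetical and only stands in for Hitters). It assumes R's default setting options(na.action = "na.omit"), under which the model frame built inside model.matrix drops any row with a missing value in a variable named in the formula, including the response.

```r
# Hypothetical toy data: the response Salary is missing in rows 2 and 4
toy <- data.frame(
  Salary = c(100, NA, 250, NA, 400),
  Hits   = c(10, 20, 30, 40, 50)
)

# Assumes the default na.action ("na.omit"); print it to confirm
print(getOption("na.action"))

# model.matrix builds a model frame first, so rows with NA in Salary are dropped
x <- model.matrix(Salary ~ . - 1, data = toy)
n_x <- nrow(x)            # rows that survived
n_y <- length(toy$Salary) # the raw response still has every row

# The surviving row names identify which observations were kept,
# so y can be aligned to x by indexing on them
y <- toy[rownames(x), "Salary"]
stopifnot(length(y) == nrow(x))
```

Indexing by rownames(x) is one way to align the response without altering the original data frame; filtering out the incomplete rows up front, as in the other answer, achieves the same result.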
