Error : 'data' must be a data.frame, environment, or list

Time:06-03

#define training and testing sets
set.seed(555)
train <- df2[1:800, c("charges")]
y_test <- df2[801:nrow(df2), c("charges")]
test <- df2[801:nrow(df2), c("age","bmi","children","smoker")]
   
#use model to make predictions on a test set
model <- pcr(charges ~ age + bmi + children + smoker, data = train, scale=TRUE, validation="CV")
pcr_pred <- predict(model, test, ncomp = 4)

#calculate RMSE
sqrt(mean((pcr_pred - y_test)^2))

I don't know why I get this error. I have already tried a number of things but I am still stuck here.

CodePudding user response:

When you executed:

train <- df2[1:800, c("charges")]

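You created an atomic vector (numeric, assuming charges holds numbers), not a data frame. When `[` selects a single column from a data frame, the result is dropped to a plain vector, which is why pcr() complains that 'data' is not a data.frame, environment, or list. The result keeps its data.frame class only if you also pass drop=FALSE: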

train <- df2[1:800, c("charges"), drop=FALSE]
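The difference can be checked on a small example. This is a minimal sketch in base R; `toy` is a made-up stand-in for df2, and its values are invented purely for illustration:

```r
# Toy data frame standing in for df2 (values are invented for illustration)
toy <- data.frame(age = c(19, 33, 45), charges = c(1725.55, 4449.46, 8240.59))

# Selecting a single column: the result is dropped to a plain vector
vec <- toy[1:2, c("charges")]

# Same selection with drop = FALSE: the data.frame class is kept
sub <- toy[1:2, c("charges"), drop = FALSE]

class(vec)          # "numeric"
is.data.frame(vec)  # FALSE
class(sub)          # "data.frame"
is.data.frame(sub)  # TRUE
```

Only `sub` is something a modelling function would accept as its data argument.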

That should fix this particular error, although without your data none of us can tell whether further errors will follow. More likely, though, you did not want the train object to be a single column at all, since the model formula refers to other columns. Try this instead:

library(pls)

set.seed(555)
train <- df2[1:800, ]
test <- df2[801:nrow(df2), ]
y_test <- test$charges

#fit the model, then make predictions on the test set
model <- pcr(charges ~ age + bmi + children + smoker, data = train, scale=TRUE, validation="CV")
pcr_pred <- predict(model, test, ncomp = 4)

#calculate RMSE
sqrt(mean((pcr_pred - y_test)^2))
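One way to catch a mismatch between the formula and the training data before fitting is to compare the variables named in the formula with the column names of the data. The following is a rough sketch in base R, not part of the pls package; `toy`, `bad_train`, and `good_train` are hypothetical names with invented values:

```r
# Formula from the question
f <- charges ~ age + bmi + children + smoker

# Toy stand-in for df2; the values are invented for illustration
toy <- data.frame(
  age = c(19, 33, 45),
  bmi = c(27.9, 22.7, 30.1),
  children = c(0, 1, 2),
  smoker = c("yes", "no", "no"),
  charges = c(1725.55, 4449.46, 8240.59)
)

bad_train  <- toy[1:2, c("charges"), drop = FALSE]  # only the response column
good_train <- toy[1:2, ]                            # all columns

# Variables the formula needs that the data does not provide
missing_bad  <- setdiff(all.vars(f), names(bad_train))
missing_good <- setdiff(all.vars(f), names(good_train))

missing_bad           # "age" "bmi" "children" "smoker"
length(missing_good)  # 0
```

A non-empty result means the formula refers to columns that the data frame does not contain, so the fit would either fail or silently pick up objects of the same name from the calling environment.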