I am trying to train a linear model in R. At first I kept receiving an error telling me to add column names to X. When I fixed that by adding data.frame, I received a new error: Error: unexpected symbol in: "meatPCR <- train(as.formula(x = absorpTrain, y = proteinTrain,method = "pcr",trControl = ctrl, tuneLength = 25) meatPLS"
It appears to be including my next line of code, "set.seed(529)", as part of my train call. Can someone please explain this, and how do I fix it?
library(AppliedPredictiveModeling)
library(ggplot2)
library(lattice)
library(caret)
library(e1071)
library(corrplot)
library(RColorBrewer) # provides brewer.pal(), used for the plot colors below
install.packages("reshape")
library(reshape)
data(tecator)
set.seed(1)
inSubset <- sample(1:dim(endpoints)[1], 10)
absorpSubset <- absorp[inSubset,]
endpointSubset <- endpoints[inSubset, 3]
newOrder <- order(-absorpSubset[,1])
absorpSubset <- absorpSubset[newOrder,]
endpointSubset <- endpointSubset[newOrder]
spectData <- as.data.frame(t(absorpSubset))
spectData$x <- 1:nrow(spectData)
spectData2 <- melt(spectData, id.vars = c("x"))
cols <- brewer.pal(9,"YlOrRd")[-(1:2)]
cols <- colorRampPalette(cols)(10)
spectTheme <- caretTheme()
spectTheme$superpose.line$col <- cols
spectTheme$superpose.line$lwd <- rep(2, 10)
spectTheme$superpose.line$lty <- rep(1, 10)
trellis.par.set(spectTheme)
xyplot(value ~ x, data = spectData2, groups = variable, type = c("l", "g"),
       panel = function(...) {
         panel.xyplot(...)
         panel.text(rep(103.5, nrow(absorpSubset)),
                    absorpSubset[, ncol(absorpSubset)],
                    paste(endpointSubset), cex = .7)
       },
       ylab = "Absorption", xlab = "")
# Figure caption (Tecator spectrum plots): A sample of 10 spectra of the Tecator data. The colors of the curves reflect the absorption values, where yellow indicates low absorption and red is indicative of high absorption.
# Solutions
pcaObj <- prcomp(absorp, center = TRUE, scale = TRUE)
pctVar <- pcaObj$sdev^2/sum(pcaObj$sdev^2)*100
head(pctVar)
set.seed(1029)
inMeatTraining <- createDataPartition(endpoints[, 3], p = 3/4, list= FALSE)
absorpTrain <- absorp[ inMeatTraining,]
absorpTest <- absorp[-inMeatTraining,]
proteinTrain <- endpoints[ inMeatTraining, 3]
proteinTest <- endpoints[-inMeatTraining,3]
ctrl <- trainControl(method = "repeatedcv", repeats = 5)
set.seed(529)
mealLm<-train(as.formula(y~x, data=test, method="lm", trControl = trainControl)
set.seed(529)
meatPCR <- train(as.formula(x = absorpTrain, y = proteinTrain,method = "pcr",trControl = ctrl, tuneLength = 25)
set.seed(529)
meatPLS <- train(y,x, method = "pls", trControl = ctrl, preProcess = c("center", "scale"),tuneLength = 25)
comps <- rbind(meatPLS$results, meatPCR$results)
comps$Model <- rep(c("PLS", "PCR"), each = 25)
CodePudding user response:
You forgot to add a closing parenthesis in these lines. I am not sure exactly where the parenthesis is supposed to go, but the missing ones are related to the as.formula call:
mealLm<-train(as.formula(y~x, data=test, method="lm", trControl = trainControl)
meatPCR <- train(as.formula(x = absorpTrain, y = proteinTrain,method = "pcr",trControl = ctrl, tuneLength = 25)
I hope that helps you fix your problem, and welcome to the site :)
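To illustrate why the error message drags the following line into it, here is a minimal sketch in base R. The strings `broken` and `fixed` are made-up stand-ins for the code in the question, not the original calls. While a parenthesis is still open, R keeps reading past the line break, so the first token of the next line is reported as unexpected.

```r
# Hypothetical two-line script: the first line is missing one ")".
broken <- "x <- sum(c(1, 2)\nset.seed(529)"

# parse() only checks syntax; nothing is evaluated here.
msg <- tryCatch(
  {
    parse(text = broken)
    "parsed without error"
  },
  error = function(e) conditionMessage(e)
)
cat(msg, "\n")  # the message points at the start of the second line

# With the parenthesis balanced, the same text parses as two expressions.
fixed <- "x <- sum(c(1, 2))\nset.seed(529)"
exprs <- parse(text = fixed)
length(exprs)
```

Counting opening and closing parentheses on the line just before the one named in the error is usually the quickest way to find the culprit.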
CodePudding user response:
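The call needs to be:
mealLm <- train(y ~ ., data = trainData, method = "lm", trControl = ctrl)
You don't need as.formula in this case: the outcome y is already a column in the data, and the period stands for every other column in the data frame. Here trainData is a placeholder for a data frame that holds y together with the predictors, and ctrl is the object you created with trainControl(); pass that object to trControl rather than the trainControl function itself.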
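Putting this together with the data from the question, a sketch of both calling styles might look like the following. It assumes the caret and pls packages are installed; `meatTrain`, the column name `protein`, and `meatLm` are names chosen here for illustration and do not appear in the original code. Converting the absorption matrix with as.data.frame() also gives the predictors column names, which is what the earlier "add column names" error was asking for.

```r
library(caret)  # train(), trainControl(), createDataPartition(), tecator data
# method = "pcr" additionally requires the pls package to be installed.

data(tecator)   # loads the objects absorp and endpoints

set.seed(1029)
inMeatTraining <- createDataPartition(endpoints[, 3], p = 3/4, list = FALSE)

# Hypothetical helper data frame: predictors plus the outcome in one place.
meatTrain <- as.data.frame(absorp[inMeatTraining, ])
meatTrain$protein <- endpoints[inMeatTraining, 3]

ctrl <- trainControl(method = "repeatedcv", repeats = 5)

# Formula interface: outcome ~ all remaining columns, no as.formula() needed.
set.seed(529)
meatLm <- train(protein ~ ., data = meatTrain,
                method = "lm", trControl = ctrl)

# x/y interface: pass predictors and outcome separately, again without
# as.formula(), and with every parenthesis closed.
set.seed(529)
meatPCR <- train(x = meatTrain[, names(meatTrain) != "protein"],
                 y = meatTrain$protein,
                 method = "pcr", trControl = ctrl, tuneLength = 25)
```

Either interface should work for all three models; the main thing is to use one of them consistently and to drop the as.formula() wrapper.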