I'm looping over the degree of the approximating polynomial when training with caret:
ds = 1:20
for(i in 1:length(ds)){
  print(i)
  d=ds[i]
  fit = train(y~poly(x,degree=d),data=training,method="lm",trControl=fitCtrl)
  # other operations
}
Running the code gives:
Error in `[.data.frame`(data, 0, cols, drop = FALSE) :
undefined columns selected
Passing the variable d (e.g. d=4) doesn't work, but hard-coding the degree in the call, i.e. degree=4, works.
Any idea what's going on here?
Thanks!
EDIT:
library(caret)
set.seed(1)
omega = 0.5*pi
xi = 0.5
phi = 0.5*pi
f = function(t)1-exp(-xi*omega*t)*sin(sqrt(1-xi^2)*omega*t+phi)/sin(phi)
sigma = 0.03
train.n = 100
x = seq(0,2*pi,by=2*pi/(train.n-1))
y = f(x)+rnorm(train.n,mean=0,sd=sigma)
training = data.frame(x=x,y=y)
fitCtrl <- trainControl(method = "LOOCV",verboseIter = FALSE)
ds = 1:20
for(i in 1:length(ds)){
  print(i)
  d=4
  fit=train(y~poly(x,degree=4),data=training,method="lm",trControl=fitCtrl)
}
CodePudding user response:
Formulas will always look for variables such as d in the data, just as it does for y and x here.
To make R interpret the d as a number, wrap it in I().
Reproducible example using mtcars:
library(dplyr, warn.conflicts = FALSE) # For the pipe
# Only showing two iterations to illustrate that code is working
ds <- 1:2
for(i in 1:length(ds)){
  d <- ds[i]
  lm(mpg~poly(hp, degree = I(d)), data = mtcars) %>%
    coefficients() %>%
    print()
}
#>             (Intercept) poly(hp, degree = I(d))
#>                20.09062               -26.04559
#>              (Intercept) poly(hp, degree = I(d))1 poly(hp, degree = I(d))2
#>                 20.09062                -26.04559                 13.15457
Created on 2022-03-19 by the reprex package (v2.0.1)
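To check which names a formula refers to, base R's all.vars() can be used. The sketch below needs no caret; training_cols is a hypothetical stand-in for the column names of the question's training data frame. That train() looks every one of these names up as a column of data is an assumption suggested by the `[.data.frame` error message, not something confirmed in this thread.

```r
# Names that R extracts from each formula (base R only).
f_symbol  <- y ~ poly(x, degree = d)   # degree given as the symbol d
f_literal <- y ~ poly(x, degree = 4)   # degree given as a literal number

vars_symbol  <- all.vars(f_symbol)
vars_literal <- all.vars(f_literal)

print(vars_symbol)    # d is listed next to y and x
print(vars_literal)   # only y and x

# Hypothetical stand-in for names(training): there is no column called d.
training_cols <- c("x", "y")

# Names in the formula that are not columns of the data.
missing_cols <- setdiff(vars_symbol, training_cols)
print(missing_cols)
```

If missing_cols is non-empty, any code that selects formula variables as data-frame columns would fail in the way the question describes.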
CodePudding user response:
We may use paste to create the formula here:
d <- 4
train(as.formula(paste0('y ~ poly(x, degree =', d, ')')),
      data = training, method = "lm", trControl = fitCtrl)
-output
Linear Regression
100 samples
1 predictor
No pre-processing
Resampling: Leave-One-Out Cross-Validation
Summary of sample sizes: 99, 99, 99, 99, 99, 99, ...
Resampling results:
RMSE Rsquared MAE
0.03790195 0.9779768 0.02937452
With the loop, we may need to store each fit in a list so that it is not overwritten on the next iteration:
ds <- 1:20
fitlst <- vector('list', length(ds))
for(i in seq_along(ds)){
  print(i)
  d <- ds[i]
  fitlst[[i]] <- train(as.formula(paste0('y ~ poly(x, degree =', d, ')')),
                       data = training, method = "lm", trControl = fitCtrl)
}
-output
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
[1] 11
[1] 12
[1] 13
[1] 14
[1] 15
[1] 16
[1] 17
[1] 18
[1] 19
[1] 20
> fitlst[[4]]
Linear Regression
100 samples
1 predictor
No pre-processing
Resampling: Leave-One-Out Cross-Validation
Summary of sample sizes: 99, 99, 99, 99, 99, 99, ...
Resampling results:
RMSE Rsquared MAE
0.03790195 0.9779768 0.02937452
Tuning parameter 'intercept' was held constant at a value of TRUE
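As an alternative to pasting strings, the value of d can be substituted into the formula with base R's bquote(). This is only a sketch of the formula-building step: it does not call train(), and the list name formulas is made up for illustration. Each element could presumably be passed to train() in place of the as.formula(paste0(...)) call above.

```r
# Build one formula per degree, with the number itself written into the formula.
ds <- 1:3
formulas <- vector("list", length(ds))

for (i in seq_along(ds)) {
  d <- ds[i]
  # bquote() replaces .(d) with the current value of d;
  # eval() then turns the resulting call into a formula object.
  formulas[[i]] <- eval(bquote(y ~ poly(x, degree = .(d))))
}

# After substitution the formulas refer only to y and x, not to d.
vars_per_formula <- lapply(formulas, all.vars)

print(formulas[[2]])
print(vars_per_formula[[2]])
```

The design choice here is to keep the formula as a language object throughout, which avoids quoting problems when the substituted value is something other than a simple number.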