Assuming I have a table with several variables, a - h, where h is the target/y/predicted variable:
a <- rnorm(10,5,1)
b <- rnorm(10,5,1)
c <- rnorm(10,5,1)
d <- rnorm(10,5,1)
e <- rnorm(10,5,1)
f <- rnorm(10,5,1)
g <- rnorm(10,5,1)
h <- rnorm(10,5,1)
df = data.frame(a,b,c,d,e,f,g,h)
I want to use the AIC to determine the best possible model for predicting h. To do that, I need to fit every single combination of df[1:7]. So I'd need the AICs of:
lm(formula = h ~ a + b + c + d + e + f + g)
lm(formula = h ~ a + b + c + d + e + f)
lm(formula = h ~ a + b + c + d + e)
As well as every other configuration of the variables. Is there any way I can do this please?
To get every possible formulation of the variables I've tried:
library(combinat)
combn(colnames(df[,1:7]))
However, I only got:
[1] "a" "b" "c" "d" "e" "f" "g"
This is a far cry from what I ultimately want.
CodePudding user response:
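Use the step function. It performs stepwise selection by AIC, adding and dropping terms one at a time rather than fitting every combination, so it should give you a good model, though not necessarily the best of all possible subsets: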
step(lm(h~., df),direction = 'both', trace = 0)
Call:
lm(formula = h ~ b e f, data = df)
Coefficients:
(Intercept)            b            e            f
     4.3494      -0.8705      -0.3266       1.2877
This model has the lowest AIC among the models the stepwise search visited. You can set trace = 1 to look at the intermediate models that were run.
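If you really do want the AIC of every combination, as the question asks, the following is a minimal sketch rather than a definitive implementation. It assumes a data frame shaped like the one in the question (predictors a to g, response h); the seed and the results table are illustrative choices, not part of the original post. It calls utils::combn with an explicit subset size m, builds each formula with reformulate, and skips the intercept-only model.

```r
# Rebuild a data frame like the one in the question: columns a..h, 10 rows.
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
df <- as.data.frame(matrix(rnorm(80, 5, 1), ncol = 8,
                           dimnames = list(NULL, letters[1:8])))

predictors <- setdiff(names(df), "h")

# Every non-empty subset of the predictors: choose m of them for m = 1..7.
# utils::combn is named explicitly because combinat::combn masks it
# when library(combinat) is loaded.
subsets <- unlist(
  lapply(seq_along(predictors),
         function(m) utils::combn(predictors, m, simplify = FALSE)),
  recursive = FALSE
)

# Fit one linear model per subset and record its formula and AIC.
results <- data.frame(
  formula = vapply(subsets,
                   function(v) deparse(reformulate(v, response = "h")),
                   character(1)),
  AIC = vapply(subsets,
               function(v) AIC(lm(reformulate(v, response = "h"), data = df)),
               numeric(1))
)

# Lowest AIC first.
results <- results[order(results$AIC), ]
head(results)
```

With 7 predictors this fits 2^7 - 1 = 127 models, which is cheap. The count doubles with each extra predictor, so for wide tables the stepwise search above is the more practical option.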