I want to calculate an "automatic" VIF for a set of variables, each one against the others. For example, in the iris dataset, I want Sepal.Width to act as the target variable and the others as explanatory variables, and then repeat this for every other variable.
First I remove the Species column, so only the numeric variables remain. Then I want to loop over each variable and test it against the others. Finally, I want the VIF results to be stored in a list.
This is what I have tried:
library(car)
library(dplyr)
library(tidyr)
iris_clean <- iris %>%
  select(-Species)
col_names <- colnames(iris)
i <- 1
for(col in col_names) {
  regr <- lm(col ~ ., data=iris_clean)
  list[i] <- vif(regr)
  i <- i + 1
}
For some reason I get an error:
Error in model.frame.default(formula = col ~ ., data = iris_clean, drop.unused.levels = TRUE) :
variable lengths differ (found for 'Sepal.Length')
I don't understand this, because all the variables have the same length. Any help will be greatly appreciated.
Answer:
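The error occurs because col holds a character string such as "Sepal.Length", so lm(col ~ ., ...) uses that length-1 character vector as the response instead of the column with that name, and its length (1) does not match the 150 rows of the data. Build the formula from the column name instead. Two further fixes are needed: take the column names from iris_clean rather than iris (otherwise Species is included), and store the results in a preallocated list with [[i]], since list is a function and vif() returns a vector. You can try this: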
library(car)
library(dplyr)
library(tidyr)
iris_clean <- iris %>% select(-Species)
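# Take the names from the cleaned data so Species is not included
col_names <- colnames(iris_clean)
# Preallocate one list element per variable
result <- vector('list', length(col_names))
for(i in seq_along(col_names)) {
  # Build the formula from the column name, e.g. "Sepal.Length ~ ."
  regr <- lm(paste0(col_names[i], ' ~ .'), data = iris_clean)
  result[[i]] <- vif(regr)
}
result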
#[[1]]
# Sepal.Width Petal.Length Petal.Width
# 1.270815 15.097572 14.234335
#[[2]]
#Sepal.Length Petal.Length Petal.Width
# 4.278282 19.426391 14.089441
#[[3]]
#Sepal.Length Sepal.Width Petal.Width
# 3.415733 1.305515 3.889961
#[[4]]
#Sepal.Length Sepal.Width Petal.Length
# 6.256954 1.839639 7.557780
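As a possible variation on the loop above, the same models can be fitted with lapply() so that each list element is named after its target variable, which makes the result easier to read than [[1]] to [[4]]. This is only a sketch: the names predictors and vif_by_target are made up for illustration, and reformulate() from base R is swapped in for the paste0() call to build the formula.

```r
library(car)  # provides vif()

# Keep only the numeric columns (same effect as select(-Species))
predictors <- iris[, setdiff(names(iris), "Species")]

# Fit one model per column, treating that column as the response.
# Passing a named vector to lapply() gives a named list back.
vif_by_target <- lapply(
  setNames(names(predictors), names(predictors)),
  function(target) {
    # reformulate(".", response = target) builds e.g. Sepal.Length ~ .
    fit <- lm(reformulate(".", response = target), data = predictors)
    vif(fit)
  }
)

# VIFs of the explanatory variables when Sepal.Width is the target
vif_by_target$Sepal.Width
```

Each element of vif_by_target is a named numeric vector with one VIF per explanatory variable, so it can be indexed by target name instead of by position.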