I have a ranger object from the tidymodels rand_forest function:
rf <- rand_forest(mode = "regression", trees = 1000) %>% fit(pay_rate ~ age + profession)
I want to get the feature importance of each variable (I have many more than in this example). I've tried things like rf$variable.importance and importance(rf), but the former returns NULL and the latter function doesn't exist. I tried using the vip package, but that doesn't work for a ranger object. How can I extract feature importances from this object?
CodePudding user response:
You need to add importance = "impurity" when you set the engine for ranger. By default ranger does not compute variable importance (its importance argument defaults to "none"), which is why rf$variable.importance returns NULL. Once this is set, you can use extract_fit_parsnip with vip to plot the variable importance.
A small example:
library(tidymodels)
library(vip)

# Ask ranger to compute impurity-based importance while fitting
rf_mod <- rand_forest(mode = "regression", trees = 100) %>%
  set_engine("ranger", importance = "impurity")

rf_recipe <-
  recipe(mpg ~ ., data = mtcars)

rf_workflow <-
  workflow() %>%
  add_model(rf_mod) %>%
  add_recipe(rf_recipe)

# Fit the workflow, pull out the parsnip model, and plot the top 10 variables
rf_workflow %>%
  fit(mtcars) %>%
  extract_fit_parsnip() %>%
  vip(num_features = 10)
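If you want the raw scores rather than a plot, the following is a minimal sketch of one way to get them. It assumes a parsnip version that provides extract_fit_engine, and it fits the model directly on mtcars instead of going through a workflow. The names rf_fit and imp are just illustrative.

```r
library(tidymodels)

# Fit a ranger forest with impurity importance enabled (same setting as above)
rf_fit <- rand_forest(mode = "regression", trees = 100) %>%
  set_engine("ranger", importance = "impurity") %>%
  fit(mpg ~ ., data = mtcars)

# extract_fit_engine() returns the underlying ranger object;
# its variable.importance element is a named numeric vector
imp <- extract_fit_engine(rf_fit)$variable.importance

# Sort so the most important predictors come first
sort(imp, decreasing = TRUE)
```

The same call should work on a fitted workflow, since extract_fit_engine also accepts workflow objects. The scores are impurity-based, so they are only comparable within one model.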
More information is available in the tidymodels "Get Started" guide.