Using the iris dataset, a k-nearest-neighbours classifier was tuned with iterative (Bayesian) search, using roc_auc as the metric for multiclass classification.
One AUC value per candidate model was calculated, as expected. However, this value is not stable; it is affected by:
- the order of the levels ("setosa", "virginica", "versicolor") of the Species column in the initial dataset
- the order of the columns in roc_auc(truth = Species, .pred_setosa, .pred_virginica, .pred_versicolor)
Does this indicate that the AUC is calculated as if the first level of the Species column were the positive event (which is expected in binary classification, whereas in multiclass classification a single AUC based on, e.g., a one-vs-all comparison would be appropriate)?
If so, is there a way to select a candidate model based on, e.g., the average of the AUC values produced by the one-vs-all comparisons?
Could this also be implemented in the metric_set used during the iterative search?
Thank you in advance for your support!
library(tidyverse)
library(tidymodels)
tidymodels_prefer()

df <- iris %>%
  mutate(Species = factor(Species, levels = c("virginica", "versicolor", "setosa")))

splits <- initial_split(df, strata = Species, prop = 4/5)
df_train <- training(splits)
df_test <- testing(splits)

df_rec <-
  recipe(Species ~ ., data = df_train)

knn_model <- nearest_neighbor(neighbors = tune()) %>%
  set_engine("kknn") %>%
  set_mode("classification")

df_wflow <-
  workflow() %>%
  add_model(knn_model) %>%
  add_recipe(df_rec)

set.seed(2023)
knn_cv <-
  df_wflow %>%
  tune_bayes(
    metrics = metric_set(roc_auc),
    resamples = vfold_cv(df_train, strata = "Species", v = 2),
    control = control_bayes(verbose = TRUE, save_pred = TRUE)
  )

cv_train_metrics <- knn_cv %>%
  collect_predictions() %>%
  group_by(.config, id) %>%
  roc_auc(truth = Species, .pred_setosa, .pred_virginica, .pred_versicolor)
Answer:
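roc_auc() expects the columns that hold the probability estimates to be in the same order as the factor levels. With the levels used in the question ("virginica", "versicolor", "setosa"), that means .pred_virginica, .pred_versicolor, .pred_setosa. We'll make the documentation better for that.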
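To illustrate the ordering requirement, here is a minimal sketch that uses the hpc_cv example predictions shipped with yardstick rather than the question's tuning results (an assumption made to keep the example self-contained). It builds the column selection from levels() so that the probability columns always follow the factor-level order, and compares that with a deliberately reversed selection.

```r
library(yardstick)
library(dplyr)

# Example multiclass predictions bundled with yardstick:
# `obs` is the truth factor; VF, F, M, L are the class probability columns.
data(hpc_cv)

# Probability columns selected in the same order as the factor levels.
ordered_cols <- levels(hpc_cv$obs)
auc_ordered <- hpc_cv %>%
  roc_auc(truth = obs, all_of(ordered_cols))

# Same columns in reversed order: the call still runs, but the
# probabilities are now matched to the wrong classes.
auc_shuffled <- hpc_cv %>%
  roc_auc(truth = obs, all_of(rev(ordered_cols)))

bind_rows(ordered = auc_ordered, shuffled = auc_shuffled, .id = "column_order")
```

For the question's data, the same idea would be to select paste0(".pred_", levels(df$Species)) instead of typing the column names by hand.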
By default, we use the method of Hand and Till to compute the area under a single multiclass ROC curve.
So this is not computing multiple ROC curves by default. You can change the estimator argument to use a different averaging method, but I would not suggest it for this metric.
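For reference, the estimator argument can be sketched as follows, again on yardstick's bundled hpc_cv predictions rather than the question's results (an assumption for self-containment). "hand_till" is the default for multiclass roc_auc(); "macro" and "macro_weighted" average one-vs-all AUCs, which is closer to what the question describes.

```r
library(yardstick)
library(dplyr)

data(hpc_cv)

# Default multiclass estimator (Hand and Till).
auc_hand_till <- roc_auc(hpc_cv, truth = obs, VF:L)

# Unweighted average of the one-vs-all AUCs.
auc_macro <- roc_auc(hpc_cv, truth = obs, VF:L, estimator = "macro")

# One-vs-all AUCs weighted by class frequency.
auc_macro_weighted <- roc_auc(hpc_cv, truth = obs, VF:L,
                              estimator = "macro_weighted")

# The .estimator column records which method produced each value.
bind_rows(auc_hand_till, auc_macro, auc_macro_weighted)
```

Whichever estimator is used, the probability columns still have to follow the factor-level order.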