Error in tidymodels - workflowsets: The provided `grid` has the following parameter columns that ha...

I am trying to use the workflowsets package, but I get an error. Here is the R code (sorry, it is quite long):

# Package ----
library(finetune)
library(themis)
library(tidymodels)

# Data ----
data("PimaIndiansDiabetes", package = "mlbench")

table(PimaIndiansDiabetes$diabetes)
str(PimaIndiansDiabetes)
PimaIndiansDiabetes <- 
  PimaIndiansDiabetes %>% 
  mutate(diabetes = relevel(diabetes, "pos"))

# Split ----
set.seed(123)
ind <- initial_split(PimaIndiansDiabetes, strata = diabetes)

dat_train <- training(ind)
dat_test <- testing(ind)

# CV ----
set.seed(123)
dat_cv <- vfold_cv(dat_train, v = 10)

# Recipe ----
dat_rec <- 
  dat_train %>% 
  recipe(diabetes ~.) %>% 
  step_normalize(all_numeric_predictors()) %>% 
  step_smote(diabetes)

# Model ----
parsnip_nn <- 
  mlp(hidden_units = tune(),
      penalty = tune(),
      epochs = tune()) %>% 
  set_mode("classification") %>% 
  set_engine("nnet")

parsnip_log <- 
  logistic_reg(penalty = tune(),
               mixture = tune()) %>% 
  set_engine("glmnet")

# Latin hypercube grid ----
latin_grid <- 
  grid_latin_hypercube(penalty(),
                       mixture(),
                       hidden_units(),
                       epochs(),
                       size = 30)

# Tuning ----
race_ctrl <-
  control_race(
    save_pred = TRUE,
    save_workflow = TRUE,
    verbose = TRUE
  )

class_metrics <- metric_set(accuracy, 
                            f_meas, 
                            j_index, 
                            kap, 
                            precision, 
                            sensitivity, 
                            specificity, 
                            roc_auc, 
                            mcc, 
                            pr_auc)

Tuned_results <- 
  workflow_set(
    preproc = list(rec = dat_rec),
    models = list(parsnip_nn = parsnip_nn,
                  parsnip_log = parsnip_log)
  ) %>% 
  workflow_map(
    fn = "tune_race_anova", 
    seed = 123,
    grid = latin_grid,
    resamples = dat_cv,
    verbose = TRUE,
    metrics = class_metrics,
    control = race_ctrl
  )

This is the error that I get. It basically says that some of the parameters in the grid have not been marked for tuning with tune() in the model being tuned.

i 1 of 2 tuning: rec_parsnip_nn
x 1 of 2 tuning: rec_parsnip_nn failed with: Error in check_grid(grid = grid, workflow = workflow, pset = pset) : The provided `grid` has the following parameter columns that have not been marked for tuning by `tune()`: 'mixture'.
i 2 of 2 tuning: rec_parsnip_log
x 2 of 2 tuning: rec_parsnip_log failed with: Error in check_grid(grid = grid, workflow = workflow, pset = pset) : The provided `grid` has the following parameter columns that have not been marked for tuning by `tune()`: 'hidden_units', 'epochs'.

And if we check Tuned_results:

# A workflow set/tibble: 2 x 4
  wflow_id        info             option    result        
  <chr>           <list>           <list>    <list>        
1 rec_parsnip_nn  <tibble [1 x 4]> <opts[4]> <try-errr [1]>
2 rec_parsnip_log <tibble [1 x 4]> <opts[4]> <try-errr [1]>

I am not sure why parameters such as mixture, hidden_units and epochs are not recognised by tune(). Any idea what I did wrong?

Answer:

The neural net doesn't have a parameter called mixture, and the regularized regression model doesn't have parameters called hidden_units or epochs. You can't use the same grid of parameters for both of the models because they don't have the same hyperparameters. Instead, you will want to:

- create separate grids for the two models
- use option_add() to add each grid to its workflow via the id argument
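
The two steps above can be sketched as follows. This is a minimal sketch rather than a definitive fix: the setup is repeated (slightly trimmed) from the question so the block runs on its own, the names nn_grid, log_grid and wf_set are made up for illustration, and the ids passed to option_add() are assumed to match the wflow_id values shown in the question's output.

```r
library(tidymodels)
library(themis)

# Setup repeated from the question so the sketch runs on its own
data("PimaIndiansDiabetes", package = "mlbench")
set.seed(123)
dat_train <- training(initial_split(PimaIndiansDiabetes, strata = diabetes))

dat_rec <-
  recipe(diabetes ~ ., data = dat_train) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_smote(diabetes)

parsnip_nn <-
  mlp(hidden_units = tune(), penalty = tune(), epochs = tune()) %>%
  set_mode("classification") %>%
  set_engine("nnet")

parsnip_log <-
  logistic_reg(penalty = tune(), mixture = tune()) %>%
  set_engine("glmnet")

# One grid per model, containing only that model's tune() parameters
# (nn_grid and log_grid are illustrative names)
nn_grid <- grid_latin_hypercube(hidden_units(), penalty(), epochs(), size = 30)
log_grid <- grid_latin_hypercube(penalty(), mixture(), size = 30)

# Attach each grid to its own workflow; ids follow "<preproc>_<model>"
wf_set <-
  workflow_set(
    preproc = list(rec = dat_rec),
    models = list(parsnip_nn = parsnip_nn, parsnip_log = parsnip_log)
  ) %>%
  option_add(grid = nn_grid, id = "rec_parsnip_nn") %>%
  option_add(grid = log_grid, id = "rec_parsnip_log")
```

Passing wf_set to workflow_map() with the same arguments as in the question, but without the grid argument, should then let each workflow use the grid stored in its own options.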

Also check out Chapter 15 of TMwR (Tidy Modeling with R) to see more about how to add an option to only a specific workflow. Since you are using a Latin hypercube, which is the default in tidymodels, you might want to just skip all that and pass grid = 30 to workflow_map() instead, so that a 30-candidate grid is generated separately for each workflow.
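
To see why an integer grid avoids the mismatch, it can help to look at each model's own parameter set. The sketch below repeats the two model specifications from the question, extracts their tunable parameters, and builds a 30-point Latin hypercube from each set. Treat this as a rough stand-in for what tuning with grid = 30 does per workflow (an assumption about the internals, not a description of them). The names nn_params, log_params, nn_auto_grid and log_auto_grid are illustrative.

```r
library(tidymodels)

# Model specifications repeated from the question
parsnip_nn <-
  mlp(hidden_units = tune(), penalty = tune(), epochs = tune()) %>%
  set_mode("classification") %>%
  set_engine("nnet")

parsnip_log <-
  logistic_reg(penalty = tune(), mixture = tune()) %>%
  set_engine("glmnet")

# Each specification exposes only the parameters it marked with tune()
nn_params <- extract_parameter_set_dials(parsnip_nn)
log_params <- extract_parameter_set_dials(parsnip_log)

nn_params$id   # parameters of the neural net
log_params$id  # parameters of the regularized logistic regression

# A grid built from a model's own parameter set cannot contain columns
# that belong to the other model
nn_auto_grid <- grid_latin_hypercube(nn_params, size = 30)
log_auto_grid <- grid_latin_hypercube(log_params, size = 30)
```

Because each grid is derived from a single model's parameter set, neither can contain a column that the model has not marked for tuning, which is the condition check_grid() complained about.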