How to set AUC as scoring method while searching for hyperparameters?

I want to run a randomized search for a classification problem, using AUC as the scoring metric instead of accuracy. Here is my code, for reproducibility:

# Define imports and create data 
import numpy as np
from sklearn.ensemble import RandomForestClassifier 
from sklearn.model_selection import RandomizedSearchCV

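np.random.seed(0)  # fix the seed so the data is reproducible
x = np.random.normal(0, 1, 100)
# Binary labels: n=1 trial with p=0.5 (n=0 would yield only zeros, leaving AUC undefined)
y = np.random.binomial(1, 0.5, 100)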


# Define the parameter grid

rf = RandomForestClassifier(random_state=0)

# Use a loop variable other than x so it is not confused with the data array above
n_estimators = [int(n) for n in np.linspace(start=200, stop=2000, num=4)]
min_samples_split = [2, 5, 10]
param_grid = {'n_estimators': n_estimators,
               'min_samples_split': min_samples_split}


# Define model
clf = RandomizedSearchCV(rf, 
                         param_grid, 
                         random_state=0, 
                         n_iter=3, 
                         cv=5).fit(x.reshape(-1, 1), y)

According to the documentation, RandomizedSearchCV accepts a scoring argument that selects the metric used to evaluate each candidate. I tried passing scoring=auc, but I got an error saying there is no such metric. What do I need to pass to get AUC instead of accuracy?

CodePudding user response：

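According to the documentation for RandomizedSearchCV, scoring can be a string or a callable. The valid string values are listed in the scikit-learn documentation for the scoring parameter; the one for AUC is 'roc_auc'. You can also pass a callable, but it has to follow the scorer signature scorer(estimator, X, y). sklearn.metrics.auc does not fit that signature, since it computes the area under a generic curve from x and y coordinates, so passing it directly will not work.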
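To illustrate the callable option, here is a minimal sketch. The function name auc_scorer, the small grid values, and the synthetic data are my own assumptions for the example, not part of the question. It assumes the estimator exposes predict_proba, which RandomForestClassifier does.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import get_scorer, roc_auc_score
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data (assumption): one feature, balanced random 0/1 labels
rng = np.random.default_rng(0)
X = rng.normal(0, 1, size=(100, 1))
y = rng.binomial(1, 0.5, size=100)


def auc_scorer(estimator, X, y):
    """Hypothetical callable scorer: ROC AUC from the positive-class probability."""
    proba = estimator.predict_proba(X)[:, 1]
    return roc_auc_score(y, proba)


# Small grid (assumption) so the example runs quickly
param_grid = {"n_estimators": [10, 20], "min_samples_split": [2, 5, 10]}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    n_iter=3,
    cv=5,
    random_state=0,
    scoring=auc_scorer,  # any callable with signature (estimator, X, y)
)
search.fit(X, y)

# The built-in scorer behind the 'roc_auc' string, for comparison
builtin = get_scorer("roc_auc")
print(auc_scorer(search.best_estimator_, X, y))
print(builtin(search.best_estimator_, X, y))
```

For a random forest the two printed values should agree, so in practice the string 'roc_auc' is the simpler choice; a hand-written callable is mostly useful when you need a metric that has no built-in name.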

CodePudding user response：

As explained by Danylo and in a linked answer, you can set the scoring function of the search to ROC AUC, so that it picks the parameter values maximizing it:

clf = RandomizedSearchCV(rf, 
                         param_grid, 
                         random_state=0, 
                         n_iter=3, 
                         cv=5,
                         scoring='roc_auc').fit(x.reshape(-1, 1), y)
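To check that the search really optimized AUC, you can inspect the fitted object afterwards. The snippet below is a self-contained sketch: the data and the smaller grid values are assumptions chosen to keep it fast, while best_params_, best_score_ and cv_results_ are the standard attributes of a fitted RandomizedSearchCV.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data (assumption): one feature, balanced random 0/1 labels
rng = np.random.default_rng(0)
X = rng.normal(0, 1, size=(100, 1))
y = rng.binomial(1, 0.5, size=100)

# Smaller grid than in the question (assumption) so the example runs quickly
param_grid = {"n_estimators": [10, 50], "min_samples_split": [2, 5, 10]}

clf = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    random_state=0,
    n_iter=3,
    cv=5,
    scoring="roc_auc",
).fit(X, y)

print(clf.best_params_)                    # parameter combination with the highest mean AUC
print(clf.best_score_)                     # mean cross-validated ROC AUC of that combination
print(clf.cv_results_["mean_test_score"])  # mean AUC for each of the 3 sampled candidates
```

Since the labels here are random, expect the AUC values to hover around 0.5; with real data the same attributes report the cross-validated AUC of each candidate.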