I want to run a randomized search for a classification problem, using AUC as the scoring metric instead of accuracy. Here is my code for reproducibility:
# Define imports and create data
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
x = np.random.normal(0, 1, 100)
# Binary labels: one Bernoulli trial per sample (n=0 trials would give all zeros)
y = np.random.binomial(1, 0.5, 100)
# Define the base estimator and the parameter grid
rf = RandomForestClassifier(random_state=0)
n_estimators = [int(x) for x in np.linspace(start=200, stop=2000, num=4)]
min_samples_split = [2, 5, 10]
param_grid = {'n_estimators': n_estimators,
              'min_samples_split': min_samples_split}
# Define model
clf = RandomizedSearchCV(rf,
                         param_grid,
                         random_state=0,
                         n_iter=3,
                         cv=5).fit(x.reshape(-1, 1), y)
According to the documentation for RandomizedSearchCV, I can pass another argument, scoring, which selects the metric used to evaluate the model. I tried passing scoring = auc, but I got an error saying there is no such metric. What do I have to do to use AUC instead of accuracy?
CodePudding user response:
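According to the documentation for RandomizedSearchCV, scoring can be a string or a callable. The scikit-learn documentation lists all valid string values for the scoring parameter; the one for AUC is 'roc_auc'. You can also pass a callable scorer with the signature scorer(estimator, X, y). Note that sklearn.metrics.auc itself is not such a scorer: it computes the area under a curve from x and y coordinates, so passing it directly will not work.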
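The callable option mentioned above can be sketched roughly as follows. This is only an illustration, not the asker's code: the helper name roc_auc_scorer, the variables X_demo and y_demo, and the small n_estimators values (chosen so it runs quickly) are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import RandomizedSearchCV


def roc_auc_scorer(estimator, X, y):
    """Hypothetical helper: score a fitted classifier by ROC AUC."""
    # ROC AUC needs a ranking score, so use the predicted probability
    # of the positive class (column 1) rather than hard predictions.
    proba = estimator.predict_proba(X)[:, 1]
    return roc_auc_score(y, proba)


# Illustrative random data with both classes present.
rng = np.random.RandomState(0)
X_demo = rng.normal(0, 1, size=(100, 1))
y_demo = rng.binomial(1, 0.5, size=100)

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [10, 20], "min_samples_split": [2, 5, 10]},
    n_iter=3,
    cv=5,
    random_state=0,
    scoring=roc_auc_scorer,  # any callable with signature (estimator, X, y)
).fit(X_demo, y_demo)

print(search.best_params_, search.best_score_)
```

For plain ROC AUC the built-in string 'roc_auc' is simpler; a custom callable is mostly useful when you need a metric that has no predefined name.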
CodePudding user response:
As explained by Danylo and in this answer, you can set the search's scoring function to ROC AUC, so that it picks the parameter values that maximize it:
clf = RandomizedSearchCV(rf,
                         param_grid,
                         random_state=0,
                         n_iter=3,
                         cv=5,
                         scoring='roc_auc').fit(x.reshape(-1, 1), y)
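If you also want to see accuracy next to AUC while still selecting by AUC, one possible approach is to pass several metric names and set refit to the one that should decide the winner. The sketch below assumes illustrative data (X_demo, y_demo) and small n_estimators values so that it runs quickly; these are not from the question.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Illustrative random data with both classes present.
rng = np.random.RandomState(0)
X_demo = rng.normal(0, 1, size=(100, 1))
y_demo = rng.binomial(1, 0.5, size=100)

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [10, 20], "min_samples_split": [2, 5, 10]},
    n_iter=3,
    cv=5,
    random_state=0,
    scoring=["roc_auc", "accuracy"],  # evaluate both metrics
    refit="roc_auc",                  # but choose the best model by AUC
).fit(X_demo, y_demo)

# With several metrics, cv_results_ holds one mean_test_<name> column each.
results = search.cv_results_
for params, auc, acc in zip(results["params"],
                            results["mean_test_roc_auc"],
                            results["mean_test_accuracy"]):
    print(params, round(auc, 3), round(acc, 3))

print("best by AUC:", search.best_params_, search.best_score_)
```

Here best_score_ is the mean cross-validated ROC AUC of the selected candidate, since refit names that metric.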