Roc_auc percentage calculation giving nan

I am trying to calculate the ROC AUC score for my data, but the result is nan.

The code:

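from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import KFold, cross_val_score

scoring = 'roc_auc'
kfold = KFold(n_splits=10, random_state=42, shuffle=True)
model = LinearDiscriminantAnalysis()
results = cross_val_score(model, df_n, y, cv=kfold, scoring=scoring)
print("AUC: %.3f (%.3f)" % (results.mean(), results.std()))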

df_n is an array of the normalised values; I also tried it with just the raw X values from the dataset. y is an array of binary values.

df_n shape: (150, 4) y shape: (150,)

I am stumped, it should work!

CodePudding user response:

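The problem is most likely that y contains more than two classes. In the multi-class case roc_auc_score expects class probabilities rather than hard predictions, and it has to be told how to combine the classes through its multi_class argument. The default 'roc_auc' scorer does not set multi_class, so scoring fails in every fold and cross_val_score records each failure as nan.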
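A minimal sketch of how the nan can arise, assuming the labels hold three classes rather than two. The iris data from load_iris is used here only as a stand-in with the same (150, 4) shape; it is not necessarily the asker's dataset, and X_demo / y_demo are illustrative names.

```python
import warnings

import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import KFold, cross_val_score

# Stand-in data (assumption): 150 samples, 4 features, 3 classes.
X_demo, y_demo = load_iris(return_X_y=True)

kfold = KFold(n_splits=10, random_state=42, shuffle=True)
model = LinearDiscriminantAnalysis()

# The plain 'roc_auc' scorer targets binary labels. With three classes the
# scoring step fails in each fold and the failure is stored as nan.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the per-fold failure warnings
    results = cross_val_score(model, X_demo, y_demo, cv=kfold, scoring="roc_auc")

print("distinct labels:", np.unique(y_demo))
print("fold scores:", results)

# error_score="raise" surfaces the underlying exception instead of nan.
raised = False
try:
    cross_val_score(model, X_demo, y_demo, cv=kfold,
                    scoring="roc_auc", error_score="raise")
except ValueError as exc:
    raised = True
    print("scoring error:", exc)
```

Checking np.unique(y) on the real labels is a quick way to confirm whether this is the cause.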

Use a new scorer:

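from sklearn.metrics import roc_auc_score, make_scorer
from sklearn.model_selection import cross_validate

# One-vs-rest AUC from predict_proba output. needs_proba is deprecated in recent scikit-learn releases (1.4 and later) in favour of response_method="predict_proba".
multi_roc_scorer = make_scorer(lambda y_in, y_p_in: roc_auc_score(y_in, y_p_in, multi_class='ovr'), needs_proba=True)
# X_s, y_s and cv stand for your features, labels and splitter; error_score="raise" shows the real error instead of nan.
scores = cross_validate(model, X_s, y_s, scoring=multi_roc_scorer, cv=cv, error_score="raise")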
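As an alternative to a hand-built scorer, scikit-learn also accepts the built-in scoring string 'roc_auc_ovr', which computes one-vs-rest AUC from predict_proba. The sketch below assumes three-class labels and again uses load_iris purely as stand-in data. StratifiedKFold replaces KFold here as a design choice, not something from the original post: multi-class AUC needs every class present in each test fold, and stratified splits guarantee that.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Stand-in data (assumption): 150 samples, 4 features, 3 classes.
X_demo, y_demo = load_iris(return_X_y=True)

# Stratified folds keep all three classes in every test split.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
model = LinearDiscriminantAnalysis()

# 'roc_auc_ovr' scores with one-vs-rest AUC on the predicted probabilities.
results = cross_val_score(model, X_demo, y_demo, cv=cv,
                          scoring="roc_auc_ovr", error_score="raise")
print("AUC: %.3f (%.3f)" % (results.mean(), results.std()))
```

If the labels really are binary, the plain 'roc_auc' string should work, and a nan would point to a different cause that error_score="raise" can reveal.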