I'm learning machine learning and this isn't clear to me. I saw a similar post on Stack Overflow, but I need a little more help to understand.
Code 1
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression

kfold = model_selection.KFold(n_splits=10, random_state=7, shuffle=True)
lrCV = LogisticRegression(class_weight='balanced')
scoring = 'roc_auc'
lr_results = model_selection.cross_val_score(lrCV, X_train, y_train, cv=kfold, scoring=scoring)
Result
array([0.91374269, 0.70209059, 0.89164087, 0.8021978 , 0.85077519,
0.80888889, 0.79338843, 0.76446281, 0.84803002, 0.74506579])
Code 2
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

lrmodel = LogisticRegression(class_weight='balanced')
lrmodel.fit(X_train, y_train)
lr_auc = roc_auc_score(y_test, lrmodel.predict(X_test))
print('Logistic Regression AUC = %.2f' % lr_auc)
Result
Logistic Regression AUC = 0.67
Code 2's score is obviously a lot lower than Code 1's. What is causing this difference? I'm new to ML and self-taught, so please explain in a little more detail.
CodePudding user response:
First of all, both code paths do end up in sklearn's roc_auc_score, but they don't feed it the same inputs. The 'roc_auc' scorer that cross_val_score builds scores continuous outputs (decision_function or predict_proba), whereas Code 2 passes the hard 0/1 labels from predict(). Computing AUC from binarized predictions throws away the ranking information and systematically lowers the score, so Code 2 starts at a disadvantage.
Code 1: https://scikit-learn.org/stable/modules/model_evaluation.html
Code 2: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html
For Code 1, if you follow the link next to roc_auc in the scoring table, you will end up on the same page as for Code 2; the metric is the same, but the inputs are not.
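You can check which scorer the string resolves to yourself. A small sketch (the exact repr depends on your scikit-learn version):

from sklearn.metrics import get_scorer

# The scoring string 'roc_auc' resolves to a scorer wrapped around
# roc_auc_score that requests continuous scores from the estimator
# (decision_function or predict_proba), not hard class labels.
scorer = get_scorer('roc_auc')
print(scorer)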
Beyond that, the two snippets also evaluate on different data: Code 1 reports ten per-fold scores from shuffled 10-fold CV on the training set, while Code 2 scores a single held-out test set. I don't know the specifics of your data, but given such a large discrepancy: what is your sample size? With a small sample, a single train-test split can be strongly biased by which rows happen to land in the test set; note that your fold scores in Code 1 already range from about 0.70 to 0.91.
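For an apples-to-apples comparison, score the held-out set with the positive-class probability and judge it against the spread of the fold scores rather than a single number. A minimal sketch, assuming a binary target and that X_train, X_test, y_train, y_test are already defined:

from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Cross-validated AUC, as in Code 1; mean +/- std shows the fold-to-fold spread.
kfold = model_selection.KFold(n_splits=10, random_state=7, shuffle=True)
lr_results = model_selection.cross_val_score(
    LogisticRegression(class_weight='balanced'),
    X_train, y_train, cv=kfold, scoring='roc_auc')
print('CV AUC: %.3f +/- %.3f' % (lr_results.mean(), lr_results.std()))

# Held-out AUC, as in Code 2, but fed the positive-class probability so it
# matches what the 'roc_auc' scorer computes.
lrmodel = LogisticRegression(class_weight='balanced').fit(X_train, y_train)
proba = lrmodel.predict_proba(X_test)[:, 1]
print('Held-out AUC: %.3f' % roc_auc_score(y_test, proba))

If the held-out number still falls well outside the fold-to-fold range after this fix, the split itself (or a small sample size) is the likely remaining culprit.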