How to build SHAP summary plot after creating ML model by GridSearchCV in Python?

Time:08-09

I use the code below to build an ML model in Python with GridSearchCV. Now I need to make a SHAP summary plot. How can I do that after fitting the model with GridSearchCV?

import pandas as pd
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import GridSearchCV
np.random.seed(42)

# generate some dummy data
df = pd.DataFrame(data=np.random.normal(loc=0, scale=1, size=(100, 3)), columns=['x1', 'x2', 'x3'])
df['y'] = np.where(df.mean(axis=1) > 0, 1, 0)

# find the best model
X = df.drop(labels=['y'], axis=1)
y = df['y']

parameters = {
    'n_estimators': [100, 500, 1000],
    'subsample': [0.01, 0.05]
}

clf = GridSearchCV(
    param_grid=parameters,
    estimator=XGBClassifier(random_state=42),
    scoring='roc_auc',
    cv=4,
    verbose=0
)

clf.fit(X, y)

# get the feature importances
importances = clf.best_estimator_.get_booster().get_score(importance_type='gain')
print(importances)

clf is a fitted GridSearchCV object. I am able to calculate feature importances, but how do I build a SHAP summary plot from a GridSearchCV result in Python?

CodePudding user response:

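With the default refit=True, GridSearchCV refits the best parameter combination on the full data and stores that model in best_estimator_. Pass that model to SHAP: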

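import shap

# explain the refitted best model rather than the GridSearchCV wrapper
model = clf.best_estimator_
explainer = shap.Explainer(model)
# calling the explainer returns an Explanation object with the SHAP values for X
shap_values = explainer(X)

# summary plot with one row per feature
shap.summary_plot(shap_values, X)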
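
By default the summary plot orders features by their mean absolute SHAP value, so the same ranking can be produced as numbers to set next to the gain importances from the question. The following is only a minimal sketch: the helper name rank_features_by_mean_abs_shap is hypothetical (it is not part of shap), the demo matrix holds made-up numbers standing in for real SHAP values, and it assumes the values form a 2-D array of shape (n_samples, n_features).

```python
import numpy as np


def rank_features_by_mean_abs_shap(values, feature_names):
    """Return (feature, mean |SHAP|) pairs, largest first.

    Hypothetical helper, not part of the shap library.
    """
    values = np.asarray(values, dtype=float)
    if values.ndim != 2 or values.shape[1] != len(feature_names):
        raise ValueError("expected an array of shape (n_samples, n_features)")
    # average magnitude of each feature's contribution across all samples
    mean_abs = np.abs(values).mean(axis=0)
    # sort descending; a stable sort keeps ties in column order
    order = np.argsort(-mean_abs, kind="stable")
    return [(feature_names[i], float(mean_abs[i])) for i in order]


# Stand-in SHAP matrix (made-up numbers): 4 samples, 3 features.
demo_values = np.array([
    [0.5, -0.1, 0.0],
    [-0.25, 0.2, 0.05],
    [0.75, -0.3, -0.05],
    [-0.5, 0.2, 0.1],
])

ranking = rank_features_by_mean_abs_shap(demo_values, ["x1", "x2", "x3"])
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With the objects from the answer above, the call would presumably be rank_features_by_mean_abs_shap(shap_values.values, list(X.columns)), assuming shap_values.values is 2-D for this binary classifier.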