I'm building a logistic regression model to predict a binary target. I want to try different values for several parameters using the param_grid
argument of GridSearchCV, to find the best-performing combination. This is my code:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.25, random_state=42)
logModel = LogisticRegression(C=1, penalty='l1', solver='liblinear')
Grid_params = {
    'penalty': ['l1', 'l2', 'elasticnet', 'none'],
    'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000],  # smaller C means stronger regularization
    'solver': ['lbfgs', 'newton-cg', 'liblinear', 'sag', 'saga'],
    'max_iter': [50, 100, 200, 500, 1000, 2500]
}
clf = GridSearchCV(logModel, param_grid=Grid_params, cv=10, verbose=True, n_jobs=-1, error_score='raise')
clf_fitted = clf.fit(X_train, Y_train)
And this is where I get the error. I have already read that some solvers don't work with l1, and some don't work with l2. How can I tune the param_grid in this case?
I also tried using a plain logModel = LogisticRegression() as the estimator, but that didn't work either.
Full error:
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.
Answer:
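GridSearchCV accepts a list of dicts for exactly this purpose: each dict is expanded as its own grid, so incompatible solver/penalty pairs are never combined. If you really need to include the solver in the search, you should be able to do something like this: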
Grid_params = [
    {'solver': ['saga'],
     'penalty': ['elasticnet', 'l1', 'l2', 'none'],
     'max_iter': [50, 100, 200, 500, 1000, 2500],
     'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000]},
    {'solver': ['newton-cg', 'lbfgs'],
     'penalty': ['l2', 'none'],
     'max_iter': [50, 100, 200, 500, 1000, 2500],
     'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000]},
    ...
]
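If writing each dict by hand gets tedious, the list can also be generated from a small compatibility table. The following is only a sketch under some assumptions: the helper build_param_grid and the SOLVER_PENALTIES table are hypothetical names, not part of scikit-learn; the table reflects my reading of the LogisticRegression docs for releases that still accept the string 'none' (newer releases expect None instead, so check your installed version); and the l1_ratio values are placeholders. It also leaves C out when there is no penalty, since C is ignored in that case, and adds l1_ratio for elasticnet, which needs it.

```python
# Solver -> penalties it accepts. Assumption: taken from the scikit-learn
# LogisticRegression docs for versions that accept the string 'none';
# verify against the documentation of the version you have installed.
SOLVER_PENALTIES = {
    "lbfgs": ["l2", "none"],
    "newton-cg": ["l2", "none"],
    "sag": ["l2", "none"],
    "liblinear": ["l1", "l2"],
    "saga": ["elasticnet", "l1", "l2", "none"],
}


def build_param_grid(solver_penalties, c_values, max_iters, l1_ratios):
    """Hypothetical helper: return a list of dicts, one per penalty,
    pairing each penalty only with the solvers that support it."""
    # Invert the table: penalty -> solvers that accept it.
    by_penalty = {}
    for solver, penalties in solver_penalties.items():
        for penalty in penalties:
            by_penalty.setdefault(penalty, []).append(solver)

    grid = []
    for penalty, solvers in by_penalty.items():
        sub_grid = {
            "penalty": [penalty],
            "solver": solvers,
            "max_iter": list(max_iters),
        }
        if penalty != "none":
            # C has no effect without a penalty, so only search it otherwise.
            sub_grid["C"] = list(c_values)
        if penalty == "elasticnet":
            # elasticnet needs l1_ratio; these values are placeholders.
            sub_grid["l1_ratio"] = list(l1_ratios)
        grid.append(sub_grid)
    return grid


Grid_params = build_param_grid(
    SOLVER_PENALTIES,
    c_values=[0.001, 0.01, 0.1, 1, 10, 100, 1000],
    max_iters=[50, 100, 200, 500, 1000, 2500],
    l1_ratios=[0.25, 0.5, 0.75],
)
```

The resulting Grid_params can be passed as param_grid to GridSearchCV exactly like the hand-written list above. Grouping by penalty rather than by solver keeps the number of dicts small and makes it easy to attach penalty-specific parameters such as l1_ratio.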