I am applying grid search to logistic regression in order to find the combination of hyperparameters that achieves the best accuracy. In this piece of code I tuned only two hyperparameters (the learning rate and the number of iterations, or "n_steps"), but I am having trouble extending it to more than two parameters (for example learning_rate, iterations, and the regularization factor, or "lmd").
Note: I need to do everything from scratch, so I can't use sklearn, only numpy.
This is my code where I tuned learning_rate and the number of iterations:
max_accuracy = 0
learning_rates = [0.01, 0.02, 0.03, 0.04, 0.05, 0.001, 0.002, 0.003, 0.004, 0.005]
iterations = [1000, 1500, 2000, 2500, 3000]
parameters = []

for i in learning_rates:
    for j in iterations:
        parameters.append((i, j))

print("Possible combinations: ", parameters)

for k in range(len(parameters)):
    model = LogisticRegression(learning_rate=parameters[k][0], n_steps=parameters[k][1], n_features=X_train.shape[1], lmd=2)
    model.fit_reg(X_train, y_train, X_valid, y_valid)
    Y_pred = model.predict(X_test, thrs=0.5)
How do I change the code if I want to tune learning_rate, n_steps and lmd?
CodePudding user response:
I'm going to use your code and modify it as little as possible. I'll also use [1, 2, 3] as the list of possible values for lmd, but you can replace these with the values you want to try.
max_accuracy = 0
learning_rates = [0.01, 0.02, 0.03, 0.04, 0.05, 0.001, 0.002, 0.003, 0.004, 0.005]
iterations = [1000, 1500, 2000, 2500, 3000]
lmd = [1, 2, 3]
parameters = []

for i in learning_rates:
    for j in iterations:
        for k in lmd:
            parameters.append((i, j, k))

print("Possible combinations: ", parameters)

for k in range(len(parameters)):
    model = LogisticRegression(learning_rate=parameters[k][0], n_steps=parameters[k][1], n_features=X_train.shape[1], lmd=parameters[k][2])
    model.fit_reg(X_train, y_train, X_valid, y_valid)
    Y_pred = model.predict(X_test, thrs=0.5)
Basically, to add a third hyperparameter you add a third list of values and an additional nested for loop, so that every possible combination ends up in the parameters list. The same pattern extends to a fourth hyperparameter, and so on.
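One thing to watch out for: in the snippet above max_accuracy is initialized but never updated, so no combination is actually selected. A minimal sketch of the selection step, assuming a NumPy-only accuracy helper (the accuracy function and the best_params name below are illustrative additions, not part of the original code):

```python
import numpy as np

def accuracy(y_true, y_pred):
    # fraction of predictions that match the true labels
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

# Inside the search loop, after predicting on the validation set:
# acc = accuracy(y_valid, Y_pred)
# if acc > max_accuracy:
#     max_accuracy = acc
#     best_params = parameters[k]
```

Selecting on the validation set (rather than the test set) keeps the test data untouched for the final evaluation.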
Good luck!
CodePudding user response:
We can use the code below, where the grid_search function takes a model, the training data X and y, and a param_grid dictionary that defines the hyperparameters and their possible values. The function iterates over all possible combinations of hyperparameters, fits the model with each combination, and computes its score. At the end, the function returns the combination with the highest score as the best hyperparameters, along with that score. Note that this assumes the model exposes set_params, fit, and score methods; with a from-scratch class like the one in the question, you would adapt those calls to fit_reg, predict, and your own accuracy computation.
import itertools
import numpy as np

def grid_search(model, X, y, param_grid):
    best_score = -np.inf
    best_params = {}
    for combination in itertools.product(*param_grid.values()):
        # pair each value tuple back with its parameter names
        params = dict(zip(param_grid.keys(), combination))
        model.set_params(**params)
        model.fit(X, y)
        score = model.score(X, y)
        if score > best_score:
            best_score = score
            best_params = params
    return best_params, best_score

param_grid = {
    'a': [1, 2, 3],
    'b': [4, 5, 6],
    'c': [7, 8, 9]
}

model = LogisticRegression()
best_params, best_score = grid_search(model, X, y, param_grid)
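To see what itertools.product contributes here, a small self-contained example (the hyperparameter names below mirror the question and are only illustrative):

```python
import itertools

param_grid = {
    'learning_rate': [0.01, 0.05],
    'n_steps': [1000, 2000],
    'lmd': [1, 2],
}

# One tuple per combination, values in the key order of the dict
combos = list(itertools.product(*param_grid.values()))
print(len(combos))   # 2 * 2 * 2 = 8
print(combos[0])     # (0.01, 1000, 1)

# Pair each tuple back with its parameter names
params = [dict(zip(param_grid.keys(), c)) for c in combos]
print(params[0])     # {'learning_rate': 0.01, 'n_steps': 1000, 'lmd': 1}
```

This is exactly the Cartesian product that the nested for loops in the first answer build by hand, so the dictionary-based approach scales to any number of hyperparameters without adding loops.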