I need to create an MLPClassifier with hidden_layer_sizes, which is a tuple specifying the number of neurons in each hidden layer.
For example, (10,) means there is a single hidden layer with 10 neurons; (10, 50,) means there are two hidden layers, the first with 10 neurons and the second with 50, and so on. I want to test each configuration in sequence.
I have passed this dictionary:
hl_parameters = {'hidden_layer_sizes': [(10,), (50,), (10,10,), (50,50,)]}
And I defined the MLPClassifier like this:
mlp_cv = MLPClassifier(hidden_layer_sizes=hl_parameters['hidden_layer_sizes'], max_iter=300, alpha=1e-4, solver='sgd', tol=1e-4, learning_rate_init=.1, verbose=True, random_state=ID)
mlp_cv.fit(X_train, y_train)
But when I fit the model, I got this error:
TypeError
Traceback (most recent call last)
Input In [65], in <cell line: 9>()
8 mlp_cv = MLPClassifier(hidden_layer_sizes=hl_parameters['hidden_layer_sizes'], max_iter=300, alpha=1e-4, solver='sgd', tol=1e-4, learning_rate_init=.1, verbose=True, random_state=ID)
----> 9 mlp_cv.fit(X_train, y_train)
File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:752, in BaseMultilayerPerceptron.fit(self, X, y)
735 def fit(self, X, y):
736 """Fit the model to data matrix X and target(s) y.
737
738 Parameters
(...)
750 Returns a trained MLP model.
751 """
--> 752 return self._fit(X, y, incremental=False)
File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:385, in BaseMultilayerPerceptron._fit(self, X, y, incremental)
383 # Validate input parameters.
384 self._validate_hyperparameters()
--> 385 if np.any(np.array(hidden_layer_sizes) <= 0):
386 raise ValueError(
387 "hidden_layer_sizes must be > 0, got %s." % hidden_layer_sizes
388 )
389 first_pass = not hasattr(self, "coefs_") or (
390 not self.warm_start and not incremental
391 )
TypeError: '<=' not supported between instances of 'tuple' and 'int'
I cannot find a solution. How do I solve this?
CodePudding user response:
MLPClassifier(hidden_layer_sizes=hl_parameters['hidden_layer_sizes'], max_iter=300, alpha=1e-4, solver='sgd', tol=1e-4, learning_rate_init=.1, verbose=True, random_state=ID)
That argument is the issue: you are passing a list of tuples as hidden_layer_sizes, but MLPClassifier accepts only a single tuple.
If you need 3 hidden layers with 10, 50 and 50 neurons, just pass (10, 50, 50) as hidden_layer_sizes. If you are testing different configurations, keep the list of tuples, but loop through it and fit one configuration at a time instead of passing the full list.
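A minimal sketch of that loop (using synthetic data in place of your X_train / y_train, and random_state=0 in place of your ID):

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Stand-in data; replace with your own X_train / y_train
X_train, y_train = make_classification(n_samples=500, random_state=0)

hl_parameters = {'hidden_layer_sizes': [(10,), (50,), (10, 10), (50, 50)]}

for sizes in hl_parameters['hidden_layer_sizes']:
    # One tuple at a time -- this is the shape MLPClassifier expects
    mlp = MLPClassifier(hidden_layer_sizes=sizes, max_iter=300, alpha=1e-4,
                        solver='sgd', tol=1e-4, learning_rate_init=.1,
                        random_state=0)
    mlp.fit(X_train, y_train)
    print(sizes, mlp.score(X_train, y_train))
```

Each iteration trains a fresh model, so you can record the scores and compare the architectures afterwards.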
CodePudding user response:
Testing multiple architectures / hyperparameters to find the best model is a task for GridSearchCV.
Here's an example testing the four architectures in the question:
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
X, y = make_classification(n_samples=10_000)
# Initialize MLPClassifier with some parameters
clf = MLPClassifier(max_iter=300, alpha=1e-4, solver="sgd", tol=1e-4, learning_rate_init=.1)
# Search over `hidden_layer_sizes`
search = GridSearchCV(clf, param_grid={'hidden_layer_sizes': [(10,), (50,), (10,10,), (50,50,)]}, n_jobs=-1, verbose=3)
search.fit(X, y)
print(search.best_params_)
Which shows us that the best cross-validation performance comes from hidden_layer_sizes=(10, 10):
{'hidden_layer_sizes': (10, 10)}
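Because GridSearchCV refits the best candidate on the full data by default (refit=True), the fitted search object can be used directly as the winning model. A quick self-contained sketch (with a smaller grid and fewer iterations than above, purely for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=200, random_state=0)

# Reduced grid and max_iter just to keep the sketch fast
clf = MLPClassifier(max_iter=50, solver="sgd", learning_rate_init=.1,
                    random_state=0)
search = GridSearchCV(clf, param_grid={'hidden_layer_sizes': [(10,), (10, 10)]})
search.fit(X, y)

# With refit=True (the default), predictions come from the best architecture
y_pred = search.predict(X)
print(search.best_params_['hidden_layer_sizes'])
```

search.best_score_ holds the mean cross-validated accuracy of the winning candidate, and search.best_estimator_ exposes the refitted model itself.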