I need to create an MLPClassifier with hidden_layer_sizes, which is a tuple specifying the number of neurons in each hidden layer.
For example, (10,) means there is a single hidden layer with 10 neurons; (10, 50,) means there are two hidden layers, the first with 10 neurons and the second with 50, and so on. I want to test each configuration in sequence.
I have passed this dictionary:
hl_parameters = {'hidden_layer_sizes': [(10,), (50,), (10,10,), (50,50,)]}
And I defined the MLPClassifier like this:
mlp_cv = MLPClassifier(hidden_layer_sizes=hl_parameters['hidden_layer_sizes'], max_iter=300, alpha=1e-4, solver='sgd', tol=1e-4, learning_rate_init=.1, verbose=True, random_state=ID)
mlp_cv.fit(X_train, y_train)
But when I fit the model, I got this error:
TypeError
Traceback (most recent call last)
Input In [65], in <cell line: 9>()
8 mlp_cv = MLPClassifier(hidden_layer_sizes=hl_parameters['hidden_layer_sizes'], max_iter=300, alpha=1e-4, solver='sgd', tol=1e-4, learning_rate_init=.1, verbose=True, random_state=ID)
----> 9 mlp_cv.fit(X_train, y_train)
File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:752, in BaseMultilayerPerceptron.fit(self, X, y)
735 def fit(self, X, y):
736 """Fit the model to data matrix X and target(s) y.
737
738 Parameters
(...)
750 Returns a trained MLP model.
751 """
--> 752 return self._fit(X, y, incremental=False)
File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:385, in BaseMultilayerPerceptron._fit(self, X, y, incremental)
383 # Validate input parameters.
384 self._validate_hyperparameters()
--> 385 if np.any(np.array(hidden_layer_sizes) <= 0):
386 raise ValueError(
387 "hidden_layer_sizes must be > 0, got %s." % hidden_layer_sizes
388 )
389 first_pass = not hasattr(self, "coefs_") or (
390 not self.warm_start and not incremental
391 )
TypeError: '<=' not supported between instances of 'tuple' and 'int'
I cannot find a solution. How do I solve this?
CodePudding user response:
MLPClassifier(hidden_layer_sizes=hl_parameters['hidden_layer_sizes'], max_iter=300, alpha=1e-4, solver='sgd', tol=1e-4, learning_rate_init=.1, verbose=True, random_state=ID)
That argument is the issue: you are passing a list of tuples as hidden_layer_sizes, but MLPClassifier accepts only a single tuple.
If you need 3 hidden layers with 10, 50 and 50 neurons, just pass (10, 50, 50) as hidden_layer_sizes. If you are testing different configurations, keep the list of tuples, but loop through it and fit one configuration at a time instead of passing the full list.
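A minimal sketch of that loop (using synthetic data in place of your X_train / y_train, and random_state=0 in place of your ID):

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Stand-in data; replace with your own X_train / y_train
X_train, y_train = make_classification(n_samples=500, random_state=0)

hl_parameters = {'hidden_layer_sizes': [(10,), (50,), (10, 10), (50, 50)]}

for sizes in hl_parameters['hidden_layer_sizes']:
    # One tuple at a time -- this is the shape MLPClassifier expects
    mlp = MLPClassifier(hidden_layer_sizes=sizes, max_iter=300, alpha=1e-4,
                        solver='sgd', tol=1e-4, learning_rate_init=.1,
                        random_state=0)
    mlp.fit(X_train, y_train)
    print(sizes, mlp.score(X_train, y_train))
```

Each iteration trains a fresh model, so you can record the scores and compare the architectures afterwards.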
CodePudding user response:
Testing multiple architectures / hyperparameters to find the best model is a task for GridSearchCV.
Here's an example testing the four architectures in the question:
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
X, y = make_classification(n_samples=10_000)
# Initialize MLPClassifier with some parameters
clf = MLPClassifier(max_iter=300, alpha=1e-4, solver="sgd", tol=1e-4, learning_rate_init=.1)
# Search over `hidden_layer_sizes`
search = GridSearchCV(clf, param_grid={'hidden_layer_sizes': [(10,), (50,), (10,10,), (50,50,)]}, n_jobs=-1, verbose=3)
search.fit(X, y)
print(search.best_params_)
Which shows us that the best cross-validation performance comes from hidden_layer_sizes=(10, 10):
{'hidden_layer_sizes': (10, 10)}
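Because GridSearchCV refits the best candidate on the full data by default (refit=True), the fitted search object can be used directly as the winning model. A quick self-contained sketch (with a smaller grid and fewer iterations than above, purely for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=200, random_state=0)

# Reduced grid and max_iter just to keep the sketch fast
clf = MLPClassifier(max_iter=50, solver="sgd", learning_rate_init=.1,
                    random_state=0)
search = GridSearchCV(clf, param_grid={'hidden_layer_sizes': [(10,), (10, 10)]})
search.fit(X, y)

# With refit=True (the default), predictions come from the best architecture
y_pred = search.predict(X)
print(search.best_params_['hidden_layer_sizes'])
```

search.best_score_ holds the mean cross-validated accuracy of the winning candidate, and search.best_estimator_ exposes the refitted model itself.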