I'm trying to follow the example in chapter 10 of the book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, which covers hyperparameter optimization for a DNN model.
The dataset is Fashion MNIST, and the goal of the project is to classify the images into 10 classes.
fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()
There is no validation set, so I create one from the first 5,000 samples:
X_valid, X_train = X_train_full[:5000], X_train_full[5000:]
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
A possible implementation of a simple DNN model, using the Keras Sequential API, is the following:
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28, 28]))
model.add(keras.layers.Dense(300, activation="relu"))
model.add(keras.layers.Dense(200, activation="relu"))
model.add(keras.layers.Dense(100, activation="relu"))
model.add(keras.layers.Dense(10, activation="softmax"))
The book then suggests exploring the hyperparameter space with RandomizedSearchCV to find the best values. The example uses keras.wrappers.scikit_learn.KerasRegressor, which is now deprecated in favor of KerasRegressor from SciKeras.
I created a function containing the ML model:
input_shape = X_train[0].shape

def build_model(n_hidden=1, n_neurons=30, learning_rate=3e-3, input_shape=input_shape):
    grid_model = keras.models.Sequential()
    grid_model.add(keras.layers.Flatten(input_shape=input_shape))
    for layer in range(n_hidden):
        grid_model.add(keras.layers.Dense(n_neurons, activation="relu"))
    grid_model.add(keras.layers.Dense(10, activation="softmax"))
    opt = tf.keras.optimizers.SGD(learning_rate=learning_rate)
    grid_model.compile(loss="sparse_categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
    return grid_model
Then I defined the model hyperparameters to explore:
param_distribs = {
    "n_hidden": [0, 1, 2, 3, 4, 5],
    "n_neurons": np.arange(1, 300),
    "learning_rate": reciprocal(3e-4, 3e-2),
}
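For context, reciprocal here is presumably scipy.stats.reciprocal, which is a log-uniform distribution: learning rates are drawn evenly across orders of magnitude rather than evenly across the raw interval. Below is a minimal NumPy sketch of that idea, not the SciPy implementation; the names rng, low, high and samples are only for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

low, high = 3e-4, 3e-2  # same bounds as the learning_rate entry above

# Log-uniform sampling: draw uniformly in log space, then exponentiate
samples = np.exp(rng.uniform(np.log(low), np.log(high), size=10_000))

# 3e-3 is the geometric midpoint of the interval, so about half of the
# draws should fall below it, even though it sits near the low end
# of the raw (linear) range
share_below_mid = float(np.mean(samples < 3e-3))
print(samples.min(), samples.max(), share_below_mid)
```

With a plain uniform distribution over the same interval, only about 9% of the draws would fall below 3e-3, so small learning rates would rarely be tried.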
I then used SciKeras to create a wrapper around the Keras model, feeding the parameter space:
keras_reg = KerasRegressor(build_model, n_hidden=param_distribs["n_hidden"], n_neurons=param_distribs["n_neurons"], learning_rate=param_distribs["learning_rate"], verbose=1)
The last step is to define a RandomizedSearchCV object and start the search using the fit method:
rnd_search_cv = RandomizedSearchCV(keras_reg, param_distribs, n_iter=10, cv=3)
rnd_search_cv.fit(X_train, y_train, epochs=100, validation_data=(X_valid, y_valid), callbacks=[keras.callbacks.EarlyStopping(patience=10)])
This last line gives me the following error, for each epoch:
/home/docker_user/.local/lib/python3.8/site-packages/sklearn/model_selection/_validation.py:776: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/docker_user/.local/lib/python3.8/site-packages/sklearn/model_selection/_validation.py", line 767, in _score
scores = scorer(estimator, X_test, y_test)
File "/home/docker_user/.local/lib/python3.8/site-packages/sklearn/metrics/_scorer.py", line 429, in _passthrough_scorer
return estimator.score(*args, **kwargs)
File "/home/docker_user/.local/lib/python3.8/site-packages/scikeras/wrappers.py", line 1100, in score
return self.scorer(y, y_pred, sample_weight=sample_weight, **score_args)
File "/home/docker_user/.local/lib/python3.8/site-packages/scikeras/wrappers.py", line 1697, in scorer
return sklearn_r2_score(y_true, y_pred, **kwargs)
File "/home/docker_user/.local/lib/python3.8/site-packages/sklearn/metrics/_regression.py", line 911, in r2_score
y_type, y_true, y_pred, multioutput = _check_reg_targets(
File "/home/docker_user/.local/lib/python3.8/site-packages/sklearn/metrics/_regression.py", line 100, in _check_reg_targets
check_consistent_length(y_true, y_pred)
File "/home/docker_user/.local/lib/python3.8/site-packages/sklearn/utils/validation.py", line 387, in check_consistent_length
raise ValueError(
ValueError: Found input variables with inconsistent numbers of samples: [18334, 183340]
The factor of 10 in the second number looks suspicious. I also checked the shapes of the data, and they look fine:
print(X_train.shape, y_train.shape)
(55000, 28, 28) (55000,)
Can you please help me deal with this error?
CodePudding user response:
As the name suggests, scikeras.wrappers.KerasRegressor is meant for regression and uses r2_score as its default scoring metric. Hence, it will not convert the softmax output to a class index like the ones you have in your validation set. The last dense layer's output is of size 10 for each sample (hence the factor of 10). Instead, you should use scikeras.wrappers.KerasClassifier, which uses accuracy as its default score. Simply swapping KerasRegressor out for KerasClassifier should make it work.
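To see where the factor of 10 can come from, here is a minimal NumPy sketch with made-up data (6 samples instead of 18334). The reshape only illustrates how ten values per sample inflate the count; it is not a claim about what SciKeras does internally. The names y_true, probs, flat_pred and class_pred are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_classes = 6, 10

# Integer class labels, shaped like y_train in the question: (n_samples,)
y_true = rng.integers(0, n_classes, size=n_samples)

# Stand-in for the softmax output: one row of 10 probabilities per sample
logits = rng.normal(size=(n_samples, n_classes))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Treated as plain regression output, the predictions hold 10x as many values
flat_pred = probs.reshape(-1)
print(y_true.shape[0], flat_pred.shape[0])  # 6 60

# A classifier reduces each row to a single class index, so the lengths match
class_pred = probs.argmax(axis=1)
accuracy = float(np.mean(class_pred == y_true))
print(class_pred.shape, accuracy)
```

The accuracy value is meaningless here because the probabilities are random. The point is only that, after the argmax, predictions and labels have the same length, which is what an accuracy-based scorer needs.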