My situation: a multiclass classification problem with 5 features (columns in my data), 15 classes, single label. My model has one input layer with 5 neurons, a single hidden layer with ReLU, and one output layer with softmax. I have two questions:
- How many neurons should the input layer have? Is it always set according to the number of features plus a bias? I tried changing the number of neurons in the input layer to, say, 77, and the performance improved, so I am confused.
- I used RandomizedSearchCV in scikit-learn to search over the number of hidden layers, the number of neurons, and the learning rate. The resulting best_params looks something like this:
{'learning_rate': 0.0023716395806862335, 'n_layer': 1, 'n_neurons': 291}
So my question is: if best_params showed 'n_layer': 2 and 'n_neurons': 291, does that mean the model has 2 hidden layers with 291 neurons each?
Thank you in advance!
CodePudding user response:
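The answer to your first question: the input layer's shape is set by the number of features. Your problem has 5 features, so the input shape must be 5; in my example I have 784 features, so the input shape is 784. The input layer has no weights or bias of its own, so there is nothing to tune there. If setting it to 77 still ran on 5-feature data, the layer you changed was most likely the first Dense (hidden) layer, and its width is a hyperparameter you are free to tune.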
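To see why the input size is not something you tune, it can help to count trainable parameters. This is only a minimal sketch in plain Python, assuming a plain fully connected network; the helper name `dense_param_count` is made up for illustration, and 77 is simply the width you mentioned trying.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes[0] is the number of input features. It adds no parameters
    by itself; it only fixes the shape of the first weight matrix.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # weight matrix between two consecutive layers
        total += n_out         # one bias per unit in the receiving layer
    return total


# 5 features -> 77 hidden units (ReLU) -> 15 classes (softmax)
print(dense_param_count([5, 77, 15]))
```

Under these assumptions the hidden layer holds 5*77 + 77 = 462 parameters and the output layer 77*15 + 15 = 1170, for 1632 in total. The 5 only ever appears as a dimension of the first weight matrix, and the bias belongs to the hidden layer, not to the input.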
Yes, there is a systematic way to choose the number of neurons in a DNN layer: hyperparameter search. I highly recommend Keras Tuner. KerasTuner finds the best hyperparameter values for your models with Bayesian Optimization, Hyperband, and Random Search algorithms. I wrote an example on the fashion_mnist dataset using the model you describe in your question. I use epochs=2; for your problem you can run the search with more epochs. For this problem, KerasTuner finds that the best number of neurons for the first layer is 416 (<- this is what you want to find) and the best learning_rate is 0.001.
# !pip install -q -U keras-tuner
import tensorflow as tf
import keras_tuner as kt
(img_train, label_train), (img_test, label_test) = tf.keras.datasets.fashion_mnist.load_data()
# Normalize pixel values between 0 and 1
img_train = img_train.astype('float32') / 255.0
img_train = img_train.reshape(60000, -1)
img_test = img_test.astype('float32') / 255.0
img_test = img_test.reshape(10000, -1)
label_train = tf.keras.utils.to_categorical(label_train, 10)
label_test = tf.keras.utils.to_categorical(label_test, 10)
def model_builder(hp):
    model = tf.keras.Sequential()
    # Input shape must match the number of features (28*28 = 784 here)
    model.add(tf.keras.layers.Input(shape=(784,)))
    # Tune the number of units in the first Dense layer
    # Choose an optimal value between 32-512
    hp_units = hp.Int('units', min_value=32, max_value=512, step=32)
    model.add(tf.keras.layers.Dense(units=hp_units, activation='relu'))
    model.add(tf.keras.layers.Dense(10, activation='softmax'))
    # Tune the learning rate for the optimizer
    # Choose an optimal value from 0.01, 0.001, or 0.0001
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=hp_learning_rate),
                  loss='categorical_crossentropy', metrics=['accuracy'])
    return model
tuner = kt.Hyperband(model_builder, objective='val_accuracy', max_epochs=3,
                     factor=3, directory='my_dir', project_name='intro_to_kt')
# With so few epochs per trial, patience=5 never triggers; it only matters for longer searches
stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
tuner.search(img_train, label_train, epochs=2, validation_split=0.2, callbacks=[stop_early])
# Get the optimal hyperparameters
best_hps=tuner.get_best_hyperparameters(num_trials=1)[0]
print(f"BEST num neurons for Dense Layer : {best_hps.get('units')}")
print(f"BEST learning_rate : {best_hps.get('learning_rate')}")
Output:
Trial 11 Complete [00h 00m 14s]
val_accuracy: 0.8530833125114441
Best val_accuracy So Far: 0.8823333382606506
Total elapsed time: 00h 01m 03s
INFO:tensorflow:Oracle triggered exit
BEST num neurons for Dense Layer : 416 # <- You want this
BEST learning_rate : 0.001
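On your second question, which my example does not cover: how 'n_layer' and 'n_neurons' are interpreted depends entirely on the model-building function you handed to RandomizedSearchCV, because the search only passes the values through. A common pattern is to reuse the same width for every hidden layer, and if your function does that, then yes, it means 2 hidden layers with 291 neurons each. The sketch below shows that convention in plain Python; the function names are hypothetical and it assumes your builder loops over `n_layer` and adds one layer of `n_neurons` units per iteration.

```python
def hidden_layer_widths(n_layer, n_neurons):
    """Widths of the hidden layers when one n_neurons value is reused per layer."""
    return [n_neurons] * n_layer


def full_architecture(n_features, n_layer, n_neurons, n_classes):
    """Layer sizes from input to output under the same-width-per-layer assumption."""
    return [n_features] + hidden_layer_widths(n_layer, n_neurons) + [n_classes]


# 5 features, best_params {'n_layer': 2, 'n_neurons': 291}, 15 classes
print(full_architecture(5, 2, 291, 15))  # [5, 291, 291, 15]
```

If your builder does something else with `n_neurons` (for example, splitting it across layers or shrinking it layer by layer), the interpretation changes accordingly, so check the loop in your own function.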