Hyperparameter Tuning (Keras) of a Neural Network Regression

We have developed an artificial neural network (ANN) in Python and would like to tune its hyperparameters with GridSearchCV. The goal of the ANN is to predict temperature from other relevant features, and so far the network performs as follows:

Coefficient of Determination (R2)    Root Mean Square Error (RMSE)    Mean Squared Error (MSE)    Mean Absolute Percent Error (MAPE)    Mean Absolute Error (MAE)    Mean Bias Error (MBE)
0.9808840288506496                   0.7527763482280911               0.5666722304516204          0.09142692180578049                   0.588041786518511           -0.07293321963266877

So far we have not managed to use GridSearchCV correctly, so we are looking for help getting to a working solution. We have a function that might work, but we cannot apply it correctly to our code.

This is the hyperparameter tuning function (GridSearchCV):

def hyperparameterTuning():
    # Listing all the parameters to try
    Parameter_Trials = {'batch_size': [10, 20, 30],
                    'epochs': [10, 20],
                    'Optimizer_trial': ['adam', 'rmsprop']
                    }

    # Creating the regression ANN model
    RegModel = KerasRegressor(make_regression_ann, verbose=0)

    # Creating the Grid search space
    grid_search = GridSearchCV(estimator=RegModel,
                           param_grid=Parameter_Trials,
                           scoring=None,
                           cv=5)

    # Running Grid Search for different parameters
    grid_search.fit(X, y, verbose=1)

    print('### Printing Best parameters ###')
    print(grid_search.best_params_)

Our main function:

if __name__ == '__main__':

    print('--------------')

    dataframe = pd.read_csv("/.../file.csv")
    
    # Splitting data into training and testing data
    X_train, X_test, y_train, y_test, PredictorScalerFit, TargetVarScalerFit = splitData(dataframe=dataframe)
    
    # Making the Regression Artificial Neural Network (ANN)
    ann = ANN(X_train=X_train, y_train=y_train, X_test=X_test, y_test=y_test, PredictorScalerFit=PredictorScalerFit, TargetVarScalerFit=TargetVarScalerFit)

    # Evaluation of the performance of the Artificial Neural Network (ANN)
    eval = evaluation(y_test_orig=ann['temp'], y_test_pred=ann['Predicted_temp'])

Our function to split data into training and testing data:

def splitData(dataframe):

    X = dataframe[Predictors].values
    y = dataframe[TargetVariable].values

    ### Standardization of data ###
    PredictorScaler = StandardScaler()
    TargetVarScaler = StandardScaler()

    # Storing the fit object for later reference
    PredictorScalerFit = PredictorScaler.fit(X)
    TargetVarScalerFit = TargetVarScaler.fit(y)

    # Generating the standardized values of X and y
    X = PredictorScalerFit.transform(X)
    y = TargetVarScalerFit.transform(y)

    # Split the data into training and testing set
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    return X_train, X_test, y_train, y_test, PredictorScalerFit, TargetVarScalerFit
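
The function above fits both scalers on the full data set before splitting, so the test rows contribute to the scaling statistics. A variant that splits first and fits the scalers on the training rows only might look like the following. This is a minimal sketch: the synthetic arrays stand in for dataframe[Predictors] and dataframe[TargetVariable] and are an assumption, not our real data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 100 rows, 7 predictors, 1 target column
rng = np.random.default_rng(42)
X = rng.normal(loc=20.0, scale=5.0, size=(100, 7))
y = rng.normal(loc=15.0, scale=3.0, size=(100, 1))  # 2-D, as StandardScaler expects

# Split first, so the test rows play no part in the scaling statistics
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Fit the scalers on the training rows only
PredictorScalerFit = StandardScaler().fit(X_train)
TargetVarScalerFit = StandardScaler().fit(y_train)

# Apply the same fitted transformation to both splits
X_train = PredictorScalerFit.transform(X_train)
X_test = PredictorScalerFit.transform(X_test)
y_train = TargetVarScalerFit.transform(y_train)
y_test = TargetVarScalerFit.transform(y_test)
```

With this ordering the training columns have zero mean and unit variance, while the test columns are only approximately standardized, which is expected.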

Our function to fit the model and to utilize the Artificial Neural Network (ANN):

def ANN(X_train, y_train, X_test, y_test, TargetVarScalerFit, PredictorScalerFit):

    model = make_regression_ann()

    # Fitting the ANN to the Training set
    model.fit(X_train, y_train, batch_size=5, epochs=100, verbose=1)

    # Generating Predictions on testing data
    Predictions = model.predict(X_test)

    # Scaling the predicted temp data back to original temp scale
    Predictions = TargetVarScalerFit.inverse_transform(Predictions)

    # Scaling the y_test temp data back to original temp scale
    y_test_orig = TargetVarScalerFit.inverse_transform(y_test)

    # Scaling the test data back to original scale
    Test_Data = PredictorScalerFit.inverse_transform(X_test)

    TestingData = pd.DataFrame(data=Test_Data, columns=Predictors)
    TestingData['temp'] = y_test_orig
    TestingData['Predicted_temp'] = Predictions
    TestingData.head()

    # Computing the absolute percent error
    APE = 100 * (abs(TestingData['temp'] - TestingData['Predicted_temp']) / TestingData['temp'])
    TestingData['APE'] = APE

    # ...
    TestingData = TestingData.round(2)

    TestingData.to_csv("TestingData.csv")

    return TestingData

Our function to make the model of the ANN:

def make_regression_ann():
    # create ANN model
    model = Sequential()

    # Defining the Input layer and FIRST hidden layer, both are same!
    model.add(Dense(units=8, input_dim=7, kernel_initializer='normal', activation='sigmoid'))

    # Defining the Second layer of the model
    # after the first layer we don't have to specify input_dim, as Keras configures it automatically
    model.add(Dense(units=6, kernel_initializer='normal', activation='sigmoid'))

    # The output neuron is a single fully connected node
    # Since we will be predicting a single number
    model.add(Dense(1, kernel_initializer='normal'))

    # Compiling the model
    model.compile(loss='mean_squared_error', optimizer='adam')

    return model

Our function to evaluate the performance of the ANN:

def evaluation(y_test_orig, y_test_pred):

    # Computing the Mean Absolute Percent Error
    MAPE = mean_absolute_percentage_error(y_test_orig, y_test_pred)

    # Computing R2 Score
    r2 = r2_score(y_test_orig, y_test_pred)

    # Computing Mean Square Error (MSE)
    MSE = mean_squared_error(y_test_orig, y_test_pred)

    # Computing Root Mean Square Error (RMSE)
    # (taken as the square root of MSE; the squared=False argument is not available in recent scikit-learn releases)
    RMSE = np.sqrt(MSE)

    # Computing Mean Absolute Error (MAE)
    MAE = mean_absolute_error(y_test_orig, y_test_pred)

    # Computing Mean Bias Error (MBE)
    MBE = np.mean(y_test_pred - y_test_orig)  # here we calculate MBE

    print('--------------')

    print('The Coefficient of Determination (R2) of ANN model is:', r2)
    print("The Root Mean Squared Error (RMSE) of ANN model is:", RMSE)
    print("The Mean Squared Error (MSE) of ANN model is:", MSE)
    print('The Mean Absolute Percent Error (MAPE) of ANN model is:', MAPE)
    print("The Mean Absolute Error (MAE) of ANN model is:", MAE)
    print("The Mean Bias Error (MBE) of ANN model is:", MBE)

    print('--------------')

    eval_list = [r2, RMSE, MSE, MAPE, MAE, MBE]
    columns = ['Coefficient of Determination (R2)', 'Root Mean Square Error (RMSE)', 'Mean Squared Error (MSE)',
               'Mean Absolute Percent Error (MAPE)', 'Mean Absolute Error (MAE)', 'Mean Bias Error (MBE)']

    dataframe = pd.DataFrame([eval_list], columns=columns)

    return dataframe
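
For reference, the metrics computed in the evaluation function above can be written out with plain NumPy. The sketch below uses made-up values for y_true and y_pred (an assumption, not our data) so the arithmetic can be checked by hand. Note that scikit-learn's mean_absolute_percentage_error returns a fraction rather than a percentage, so the MAPE of about 0.091 reported above corresponds to roughly 9.1 %.

```python
import numpy as np

# Made-up values for illustration only
y_true = np.array([10.0, 20.0, 30.0, 40.0])
y_pred = np.array([12.0, 18.0, 33.0, 39.0])

err = y_pred - y_true                          # [ 2., -2.,  3., -1.]

mse = np.mean(err ** 2)                        # (4 + 4 + 9 + 1) / 4 = 4.5
rmse = np.sqrt(mse)                            # square root of the MSE
mae = np.mean(np.abs(err))                     # (2 + 2 + 3 + 1) / 4 = 2.0
mbe = np.mean(err)                             # signed mean error: 2 / 4 = 0.5
mape = np.mean(np.abs(err) / np.abs(y_true))   # a fraction: 0.10625

# R2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum(err ** 2)                      # 18
ss_tot = np.sum((y_true - y_true.mean()) ** 2) # 500
r2 = 1.0 - ss_res / ss_tot                     # 0.964

print(r2, rmse, mse, mape, mae, mbe)
```

A positive MBE means the predictions are too high on average, and a negative one (as in our results) means they are too low.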

CodePudding user response：

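Your code should work if you update the make_regression_ann function so that every hyperparameter you want to optimize is an argument of the function. Each key in the parameter grid must match one of those argument names, which 'Optimizer_trial' in your grid does not. The exception is the fitting parameters (batch_size and epochs), which do not need to be arguments of the function.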

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
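# Note: this wrapper module was removed from recent TensorFlow releases; the separate SciKeras package is the suggested replacement there
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor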
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import make_regression

def make_regression_ann(initializer='uniform', activation='relu', optimizer='adam', loss='mse'):

    model = Sequential()
    model.add(Dense(units=8, input_dim=7, kernel_initializer=initializer, activation=activation))
    model.add(Dense(units=6, kernel_initializer=initializer, activation=activation))
    model.add(Dense(1, kernel_initializer=initializer))
    model.compile(loss=loss, optimizer=optimizer)

    return model

param_grid = {
    'initializer': ['normal', 'uniform'],
    'activation': ['relu', 'sigmoid'],
    'optimizer': ['adam', 'rmsprop'],
    'loss': ['mse', 'mae'],
    'batch_size': [32, 64],
    'epochs': [5, 10],
}

grid_search = GridSearchCV(
    estimator=KerasRegressor(make_regression_ann, verbose=0),
    param_grid=param_grid,
    scoring='neg_mean_absolute_percentage_error',
    cv=3,
)

X, y = make_regression(n_features=7, n_samples=100, random_state=42)

grid_search.fit(X, y, verbose=1)

grid_search.best_params_
# {'activation': 'sigmoid',
#  'batch_size': 32,
#  'epochs': 10,
#  'initializer': 'normal',
#  'loss': 'mae',
#  'optimizer': 'adam'}
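
Beyond best_params_, the fitted search object also keeps the score of every combination in cv_results_. The sketch below shows one way to read it. To keep it runnable without TensorFlow it uses Ridge as a stand-in estimator and 'neg_mean_absolute_error' as the scorer; both are assumptions for illustration, and the same attributes are available when the estimator is the Keras wrapper.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Stand-in data and estimator (assumptions, not the Keras model above)
X, y = make_regression(n_features=7, n_samples=100, noise=10.0, random_state=42)

search = GridSearchCV(
    estimator=Ridge(),
    param_grid={'alpha': [0.1, 1.0, 10.0]},
    scoring='neg_mean_absolute_error',
    cv=3,
)
search.fit(X, y)

# One row per parameter combination, best-ranked first
results = pd.DataFrame(search.cv_results_)
results = results[['params', 'mean_test_score', 'std_test_score', 'rank_test_score']]
results = results.sort_values('rank_test_score')
print(results)

# "neg_" scorers return the negated error (higher is better),
# so flip the sign to report the error itself
best_mae = -search.best_score_
print('Best parameters:', search.best_params_)
print('Cross-validated MAE of the best setting:', best_mae)
```

The same sign convention applies to 'neg_mean_absolute_percentage_error' used above: best_score_ is negative, and its absolute value is the cross-validated error.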

CodePudding user response：

Here is how I recently used GridSearchCV successfully:

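from sklearn import svm
from sklearn.model_selection import GridSearchCV

# Hyperparameter grid: every combination of C and max_iter is tried
tuned_parameters2 = {'C': [1, 10, 100, 10000], 'max_iter': [5000, 10000, 50000]}
model2 = GridSearchCV(svm.LinearSVC(), tuned_parameters2)
model2.fit(features, y_train)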

So: put the hyperparameters in a separate dictionary, then pass your estimator and that dictionary to GridSearchCV, e.g. GridSearchCV(KerasRegressor(make_regression_ann), the_hyperparam_dict). Then fit it with the data.

In your case this approach would require more refactoring. It’s up to you to decide whether it is better to feed your ANN to GridSearchCV.
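
The pattern described above (a dictionary of hyperparameters, an estimator, then fit) can be sketched end to end as follows. The synthetic data from make_classification and the smaller grid of C values are assumptions made so that the example runs on its own; they are not the data or grid from the snippet above.

```python
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic classification data (assumption; the original features are not shown)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
features, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# 1) Hyperparameters in a separate dictionary
tuned_parameters2 = {'C': [0.01, 0.1, 1], 'max_iter': [5000, 10000]}

# 2) Estimator and dictionary go to GridSearchCV
model2 = GridSearchCV(svm.LinearSVC(), tuned_parameters2, cv=3)

# 3) Fit on the training data; the best setting is refit on all of it
model2.fit(features, y_train)

print('Best parameters:', model2.best_params_)
print('Held-out accuracy:', model2.score(X_test, y_test))
```

After fitting, model2 behaves like the best estimator, so predict and score can be called on it directly.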