I have the layers below in my neural network, which works on a forecasting problem:
tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(2),input_shape=x_train_final.shape[-2:]),
tf.keras.layers.Dropout(0.25),
tf.keras.layers.Dense(units=horizon)
Is there a way to combine them into a single layer? My training and validation loss curves over the epochs suggest overfitting, so I would like to simplify my network.
Something like this:
tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(2, recurrent_dropout=0.2),input_shape=x_train_final.shape[-2:],units=3)
Update 1:
The model has 95 trainable parameters. My input and output have shapes (41, 2, 2) and (41, 3, 1):
lstm_model = tf.keras.models.Sequential([
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(2), input_shape=x_train_final.shape[-2:]),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Dense(units=horizon),
])
lstm_model.compile(loss='mse', optimizer='adam')
lstm_model.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
bidirectional (Bidirectional (None, 4)                 80
_________________________________________________________________
dropout_4 (Dropout)          (None, 4)                 0
_________________________________________________________________
dense_6 (Dense)              (None, 3)                 15
=================================================================
Total params: 95
Trainable params: 95
Non-trainable params: 0
_________________________________________________________________
CodePudding user response:
You could try something like this, but note that it does not reduce the number of trainable parameters:
import tensorflow as tf

class CustomBidirectional(tf.keras.layers.Layer):
    # Wraps a bidirectional LSTM and a Dense output layer in a single layer.
    def __init__(self, lstm_layer, input_shape, units):
        super(CustomBidirectional, self).__init__()
        self.bilstm = tf.keras.layers.Bidirectional(lstm_layer, input_shape=input_shape)
        self.dense = tf.keras.layers.Dense(units=units)

    def call(self, inputs):
        # (batch, timesteps, features) -> (batch, 2 * lstm_units) -> (batch, units)
        x = self.bilstm(inputs)
        return self.dense(x)

# Dummy data: 5 samples, 10 timesteps, 20 features
x_train_final = tf.random.normal((5, 10, 20))
bidirectional_layer = CustomBidirectional(tf.keras.layers.LSTM(2, recurrent_dropout=0.2), input_shape=x_train_final.shape[-2:], units=3)
bidirectional_layer(x_train_final)  # output shape: (5, 3)
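To see why wrapping the layers does not change the parameter count, the 95 reported by `lstm_model.summary()` in the question can be reproduced by hand. This is a minimal sketch in plain Python, assuming the standard LSTM parameterization (four gates, each with an input kernel, a recurrent kernel, and a bias). The helper names `lstm_params` and `dense_params` are made up for illustration and are not part of Keras.

```python
def lstm_params(input_dim, units):
    # Four gates, each with an input kernel (input_dim x units),
    # a recurrent kernel (units x units), and a bias (units).
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # Weight matrix (input_dim x units) plus one bias per output unit.
    return input_dim * units + units

input_dim = 2    # features per timestep, from the input shape (41, 2, 2)
lstm_units = 2   # LSTM(2)
horizon = 3      # Dense(units=horizon), from the output shape (41, 3, 1)

# Bidirectional runs one forward and one backward LSTM, so the count doubles.
bilstm = 2 * lstm_params(input_dim, lstm_units)
# The Dense layer sees the concatenated forward and backward outputs.
dense = dense_params(2 * lstm_units, horizon)
total = bilstm + dense

print(bilstm, dense, total)  # 80 15 95
```

Dropout contributes no parameters, so moving it into the LSTM as `recurrent_dropout` or dropping the separate layer leaves the total unchanged. Only fewer LSTM units, fewer input features, or a shorter horizon would shrink it.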