tensorflow bidirectional layer define output shape


I have the layers below in a neural network that I am using for a forecasting problem.

tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(2), input_shape=x_train_final.shape[-2:]),
tf.keras.layers.Dropout(0.25),
tf.keras.layers.Dense(units=horizon)

Is there a way to combine them into a single layer? When I look at my training and validation loss over epochs, it looks like the model is overfitting, so I would like to simplify the network.

Something like the following:

tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(2, recurrent_dropout=0.2), input_shape=x_train_final.shape[-2:], units=3)

Update 1:

I get 95 trainable parameters; my input and output are of shape (41, 2, 2) and (41, 3, 1) respectively (a hand breakdown of the count follows after the summary below).

lstm_model = tf.keras.models.Sequential([
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(2), input_shape=x_train_final.shape[-2:]),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Dense(units=horizon),
])

lstm_model.compile(loss='mse', optimizer='adam')

lstm_model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
bidirectional (Bidirectional (None, 4)                 80        
_________________________________________________________________
dropout_4 (Dropout)          (None, 4)                 0         
_________________________________________________________________
dense_6 (Dense)              (None, 3)                 15        
=================================================================
Total params: 95
Trainable params: 95
Non-trainable params: 0
_________________________________________________________________
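
For what it's worth, the 95 parameters check out by hand. The breakdown below is my own sketch, assuming 2 input features per time step, 2 LSTM units per direction, and horizon = 3:

units, features = 2, 2   # LSTM units per direction, input features per time step
horizon = 3              # Dense output size

# Each LSTM direction has 4 gates, each with an input kernel, a recurrent kernel and a bias.
lstm_per_direction = 4 * (units * features + units * units + units)  # 4 * 10 = 40
bidirectional_params = 2 * lstm_per_direction                        # 80

# The Dense head sees the concatenated forward/backward states (2 * units = 4 inputs).
dense_params = (2 * units) * horizon + horizon                       # 4 * 3 + 3 = 15

print(bidirectional_params + dense_params)                           # 95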

CodePudding user response:

Maybe try something like this, but note that it does not reduce the number of trainable parameters:

import tensorflow as tf

class CustomBidirectional(tf.keras.layers.Layer):
    """Single layer that wraps a Bidirectional LSTM followed by a Dense projection."""

    def __init__(self, lstm_layer, input_shape, units):
        super(CustomBidirectional, self).__init__()
        # Wrap the supplied LSTM in a Bidirectional layer.
        self.bilstm = tf.keras.layers.Bidirectional(lstm_layer, input_shape=input_shape)
        # Dense head that maps the concatenated forward/backward states to `units` outputs.
        self.dense = tf.keras.layers.Dense(units=units)

    def call(self, inputs):
        x = self.bilstm(inputs)
        return self.dense(x)

# Dummy input: batch of 5 sequences, 10 time steps, 20 features each.
x_train_final = tf.random.normal((5, 10, 20))

bidirectional_layer = CustomBidirectional(
    tf.keras.layers.LSTM(2, recurrent_dropout=0.2),
    input_shape=x_train_final.shape[-2:],
    units=3,
)
bidirectional_layer(x_train_final)
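
If the main goal is to reduce overfitting rather than to merge the layers, a simpler option (my own suggestion, not part of the custom layer above) is to keep the original Sequential model and pass recurrent_dropout to the inner LSTM; Bidirectional itself does not accept a units argument, which is why the Dense head has to stay a separate layer:

horizon = 3  # assumed forecast length, as in the question

lstm_model = tf.keras.models.Sequential([
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(2, recurrent_dropout=0.2),
        input_shape=x_train_final.shape[-2:],
    ),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Dense(units=horizon),
])
lstm_model.compile(loss='mse', optimizer='adam')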