I am a bit confused about the output shape of a Keras layer. I have created a sample Keras model and displayed its summary.
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

numberOfLSTMcells = 1
n_timesteps_in = 129
n_features = 61

inp = Input(shape=(n_timesteps_in, n_features))
lstm = LSTM(numberOfLSTMcells, return_sequences=True, return_state=False)(inp)
fc = Dense(64, activation='relu', name='hidden_layer')(lstm)
out = Dense(1, activation='sigmoid', name='last_layer')(fc)
model = Model(inputs=inp, outputs=out)
Summary of the model:
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) [(None, 129, 61)] 0
_________________________________________________________________
lstm_2 (LSTM) (None, 129, 1) 252
_________________________________________________________________
hidden_layer (Dense) (None, 129, 64) 128
_________________________________________________________________
last_layer (Dense) (None, 129, 1) 65
=================================================================
Total params: 445
Trainable params: 445
Non-trainable params: 0
I think the shape of last_layer should be (None, 64, 1), because hidden_layer has 64 neurons, which go as input to last_layer.
Answer:
Since you set the parameter return_sequences to True in the LSTM layer, you get a sequence with the same number of time steps as your input and an output space of 1 for each time step, hence the shape (None, 129, 1). Afterwards, you apply a Dense layer to this tensor, but a Dense layer is always applied to the last dimension of a tensor, which in your case is 1, not 129. Therefore you get the output shape (None, 129, 64). Then you use a final output layer, which is also applied to the last dimension of your tensor, resulting in the output shape (None, 129, 1). The TensorFlow docs also explain this behavior:
If the input to the layer has a rank greater than 2, then Dense computes the dot product between the inputs and the kernel along the last axis of the inputs and axis 0 of the kernel (using tf.tensordot).
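You can verify this directly. Here is a minimal sketch (the dummy tensor and shapes are just for illustration) showing that Dense only transforms the last axis of a 3D input:

import tensorflow as tf
from tensorflow.keras.layers import Dense

# Dummy 3D tensor shaped (batch, time_steps, features) = (2, 129, 1)
x = tf.zeros((2, 129, 1))
y = Dense(64)(x)
print(y.shape)  # (2, 129, 64): Dense maps the last axis from 1 to 64,
                # leaving the batch and time-step axes untouched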
You can set return_sequences to False if you want to work with a 2D output (batch_size, features) instead of a 3D output (batch_size, time_steps, features), or you can use a Flatten layer.
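For example, a sketch of both options, assuming the same inp and numberOfLSTMcells as in your question:

from tensorflow.keras.layers import Flatten

# Option 1: return only the last time step's output -> shape (None, 1)
lstm_last = LSTM(numberOfLSTMcells, return_sequences=False)(inp)

# Option 2: keep the full sequence, then flatten -> shape (None, 129 * 1)
lstm_seq = LSTM(numberOfLSTMcells, return_sequences=True)(inp)
flat = Flatten()(lstm_seq)

Either way, a Dense layer applied afterwards then acts on a 2D tensor, which is probably what you expected in the first place.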