I am a bit confused about the output shape of a Keras layer. I have created a sample Keras model and displayed its summary.
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

numberOfLSTMcells = 1
n_timesteps_in = 129
n_features = 61

inp = Input(shape=(n_timesteps_in, n_features))
lstm = LSTM(numberOfLSTMcells, return_sequences=True, return_state=False)(inp)
fc = Dense(64, activation='relu', name='hidden_layer')(lstm)
out = Dense(1, activation='sigmoid', name='last_layer')(fc)
model = Model(inputs=inp, outputs=out)
Summary of the model:
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) [(None, 129, 61)] 0
_________________________________________________________________
lstm_2 (LSTM) (None, 129, 1) 252
_________________________________________________________________
hidden_layer (Dense) (None, 129, 64) 128
_________________________________________________________________
last_layer (Dense) (None, 129, 1) 65
=================================================================
Total params: 445
Trainable params: 445
Non-trainable params: 0
I think the shape of last_layer should be (None, 64, 1), because hidden_layer has 64 neurons, which go as input to last_layer.
Answer:
Since you set the parameter return_sequences to True in the LSTM layer, you get a sequence with the same number of time steps as your input and an output space of 1 for each time step, hence the shape (None, 129, 1). Afterwards, you apply a Dense layer to this tensor, but a Dense layer is always applied to the last dimension of a tensor, which in your case is 1, not 129. Therefore you get the output shape (None, 129, 64). Then you use a final output layer, which is also applied to the last dimension of your tensor, resulting in the output shape (None, 129, 1). The TensorFlow docs also explain this behavior:
If the input to the layer has a rank greater than 2, then Dense computes the dot product between the inputs and the kernel along the last axis of the inputs and axis 0 of the kernel (using tf.tensordot).
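You can verify this directly. Here is a minimal sketch (the dummy tensor and shapes are just for illustration) showing that Dense only transforms the last axis of a 3D input:

import tensorflow as tf
from tensorflow.keras.layers import Dense

# Dummy 3D tensor shaped (batch, time_steps, features) = (2, 129, 1)
x = tf.zeros((2, 129, 1))
y = Dense(64)(x)
print(y.shape)  # (2, 129, 64): Dense maps the last axis from 1 to 64,
                # leaving the batch and time-step axes untouched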
You can set return_sequences to False if you want to work with a 2D output (batch_size, features) instead of a 3D output (batch_size, time_steps, features), or you can use a Flatten layer.
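For example, a sketch of both options, assuming the same inp and numberOfLSTMcells as in your question:

from tensorflow.keras.layers import Flatten

# Option 1: return only the last time step's output -> shape (None, 1)
lstm_last = LSTM(numberOfLSTMcells, return_sequences=False)(inp)

# Option 2: keep the full sequence, then flatten -> shape (None, 129 * 1)
lstm_seq = LSTM(numberOfLSTMcells, return_sequences=True)(inp)
flat = Flatten()(lstm_seq)

Either way, a Dense layer applied afterwards then acts on a 2D tensor, which is probably what you expected in the first place.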