How to connect a LSTM layer to a Linear layer in Pytorch


I am doing a classification task on MFCC features (time-series data) using an LSTM.

My input has shape (16, 60, 40), i.e. (batch, steps, features).
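For reference, a dummy batch with this shape can be built like this (a stand-in only; the real input is a tensor of MFCC features):

import torch
import torch.nn as nn

X = torch.randn(16, 60, 40)  # (batch=16, steps=60, features=40)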

class model(nn.Module):
    def __init__(self, ninp, num_layers, class_num, nhid=128):
        super().__init__()

        self.lstm_nets = nn.LSTM(input_size=ninp, hidden_size=nhid,
                                 num_layers=num_layers, batch_first=True,
                                 dropout=0.2, bidirectional=False)
        self.FC = nn.Linear(nhid, class_num)
        self.tanh = nn.Tanh()
        self.softmax = nn.LogSoftmax(1)
      
    def forward(self, X):
        device = 'cuda:0'
        out, (ht, ct) = self.lstm_nets(X)

        # out = ht.contiguous().view(16, -1)

        out = self.tanh(out)
        out = self.FC(out)
        Out = self.softmax(out)
        return Out

model = model(ninp=X.shape[2], num_layers=1, class_num=32, nhid=128)
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.5e-4)
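The errors below come from a training loop along these lines (a minimal sketch reconstructed from the traceback; train_loader and the integer label tensor y are assumptions):

for X, y in train_loader:  # X: (16, 60, 40), y: (16,) integer class labels (assumed)
    optimizer.zero_grad()
    y_pred = model(X)
    # calculate loss function
    loss = loss_function(y_pred, y)
    loss.backward()
    optimizer.step()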

If I use out = ht.contiguous().view(16,-1) to flatten the LSTM output, I get this error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-96-a7e2ba68dcd9> in <module>()
     11 
     12         optimizer.zero_grad()
---> 13         y_pred = model(X)
     14         # calculate loss function
     15         loss = loss_function(y_pred, y)

3 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/linear.py in forward(self, input)
    101 
    102     def forward(self, input: Tensor) -> Tensor:
--> 103         return F.linear(input, self.weight, self.bias)
    104 
    105     def extra_repr(self) -> str:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (16x32 and 128x32)

If I instead use out = out.contiguous().view(16,-1) to flatten the LSTM output, I get the error RuntimeError: mat1 and mat2 shapes cannot be multiplied (16x7680 and 128x32).

If I remove the flatten step, I get the error RuntimeError: Expected target size [16, 32], got [16]. In addition, the examples I found online do not flatten the output of the LSTM.

Thanks for any help.

CodePudding user response:

At each timestep, an LSTM processes one step of the input through a small neural network and passes its hidden state on to the next timestep. The output out of

out, (ht, ct) = self.lstm_nets(X)

contains the outputs of ALL timesteps, so with batch_first=True it has shape (batch, steps, hidden_size). In classification, however, you usually only care about the LAST timestep's output. You can select it like this:

out = out[:, -1]

This output now has shape (batch, hidden_size), i.e. (16, 128) here, which is exactly what nn.Linear(nhid, class_num) expects. So in your case your forward function should look like this:

def forward(self, X):
    device = 'cuda:0'  # unused here
    out, (ht, ct) = self.lstm_nets(X)
    out = out[:, -1]  # keep only the last timestep: (16, 128)

    out = self.tanh(out)
    out = self.FC(out)  # (16, 128) -> (16, 32)

    Out = self.softmax(out)
    return Out
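A quick shape check with dummy data (a sketch; random tensors stand in for the MFCC batch, and the model class from the question is reused with the fixed forward) confirms that the Linear layer and the loss now line up. Note that for a single-layer, unidirectional LSTM, out[:, -1] is the same tensor as ht[-1].

import torch
import torch.nn as nn

X = torch.randn(16, 60, 40)      # (batch, steps, features)
y = torch.randint(0, 32, (16,))  # integer class labels

net = model(ninp=X.shape[2], num_layers=1, class_num=32, nhid=128)
y_pred = net(X)
print(y_pred.shape)  # torch.Size([16, 32])

loss = nn.CrossEntropyLoss()(y_pred, y)  # (16, 32) logits vs (16,) targets: OK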