I have an X_train of size (2, 100). I want to use the first 50 columns of the data as-is, and pass the second 50 columns through an embedding layer to convert them into a matrix of size 2*3.
I have read a lot about the embedding layer in PyTorch, but I still don't understand how to get a 2*3 output from it. Could you please help me with that? Here is a simple example.
import numpy as np
import torch
import torch.nn as nn

X_train = np.random.randint(10, size=(2, 100))
X_train_notembedding = X_train[:, 0:50]   # not embedded, shape (2, 50)
X_train_embedding = X_train[:, 50:100]    # to be embedded, shape (2, 50)
X_train_embedding = torch.LongTensor(X_train_embedding)  # shape (2, 50)
embedding = nn.Embedding(50, 3)
embedding_output = embedding(X_train_embedding)  # I want to get an embedding output of shape (2, 3)
#X_train_new = torch.cat([X_train_notembedding, embedding_output], 1) # here I want to build a matrix of size (2, 53)
CodePudding user response:
From the discussion, it looks like your understanding of embeddings is not accurate.
- Use one Embedding per feature. In your example you are combining dates, IDs, etc. in a single Embedding. Even in the Medium article, they use separate embeddings.
- Think of an Embedding as one-hot encoding on steroids (less memory, it captures correlation between categories, etc.). If you do not understand one-hot encoding, I would start there first.
- KWH is already a real value, not categorical. Use it as a linear input to the network (after normalization).
- ID: I do not know what ID denotes in your data; if it is a unique ID for each datapoint, it is not useful and should be excluded.
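To make the one-Embedding-per-feature point concrete, here is a minimal sketch (the feature names `hour` and `weekday` and their vocabulary sizes are made up for illustration). Each categorical feature gets its own `nn.Embedding`, and embedding a batch of 2 indices, one index per sample, yields a (2, 3) tensor, which is how you get a 2*3 output:

```python
import torch
import torch.nn as nn

# Two hypothetical categorical features, one index per sample (batch of 2).
hour = torch.LongTensor([3, 17])    # shape (2,): one "hour" index per sample
weekday = torch.LongTensor([0, 6])  # shape (2,): one "weekday" index per sample

# One Embedding per feature: num_embeddings must cover the index range.
hour_emb = nn.Embedding(24, 3)      # 24 possible hours -> 3-dim vectors
weekday_emb = nn.Embedding(7, 3)    # 7 possible weekdays -> 3-dim vectors

h = hour_emb(hour)                  # shape (2, 3): one 3-dim vector per sample
w = weekday_emb(weekday)            # shape (2, 3)
```

Note the contrast with the question's code: passing a (2, 50) tensor into an Embedding returns shape (2, 50, 3), one vector per index, not (2, 3).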
If the above does not make sense, I would start with a simple network using an LSTM and make it work first, before moving to a more advanced architecture.
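As a sketch of the "real value as a linear input" point: keep the (already normalized) KWH value as-is and concatenate it with the embedded categorical features before the first linear layer. The shapes and layer sizes below are illustrative, not taken from your data:

```python
import torch
import torch.nn as nn

kwh = torch.randn(2, 1)             # normalized real-valued feature, shape (2, 1)
weekday = torch.LongTensor([0, 6])  # hypothetical categorical feature, shape (2,)

weekday_emb = nn.Embedding(7, 3)    # 7 categories -> 3-dim vectors

# Concatenate the real value with the embedded categorical along the
# feature dimension, then feed the combined vector to a linear layer.
x = torch.cat([kwh, weekday_emb(weekday)], dim=1)  # shape (2, 1 + 3) = (2, 4)
out = nn.Linear(4, 8)(x)            # shape (2, 8)
```

This mirrors what the commented-out `torch.cat` in the question is trying to do, except that the embedding output here has one vector per sample, so the shapes actually line up.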