I'm trying to use the autoencoder whose code you can see below as a tool for dimensionality reduction, and I was wondering how I can "extract" the hidden layer and use it for my purpose.
My original dataset underwent standard scaling beforehand.
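For reference, that preprocessing step would look roughly like this (a minimal sketch; it assumes scikit-learn's StandardScaler and that train and test are pandas DataFrames, with the scaler fit on the training split only):

from sklearn.preprocessing import StandardScaler
import pandas as pd

scaler = StandardScaler()
# Fit on the training split only, then apply the same transform to the test split
train = pd.DataFrame(scaler.fit_transform(train), columns=train.columns)
test = pd.DataFrame(scaler.transform(test), columns=test.columns)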
Here I define a dictionary to centralize the hyperparameters:
CONFIG = {
    'BATCH_SIZE': 1024,
    'LR': 1e-4,
    'WD': 1e-8,
    'EPOCHS': 50
}
Here I convert the values of my train and test DataFrames into tensors:
import torch
import pandas as pd

t_test = torch.FloatTensor(test.values)
t_train = torch.FloatTensor(train.values)
Here I create the data loaders:
loader_test = torch.utils.data.DataLoader(dataset=t_test,
                                          batch_size=CONFIG['BATCH_SIZE'],
                                          shuffle=False)  # no need to shuffle for evaluation
loader_train = torch.utils.data.DataLoader(dataset=t_train,
                                           batch_size=CONFIG['BATCH_SIZE'],
                                           shuffle=True)
Here I define the autoencoder class (AE):
class AE(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = torch.nn.Sequential(
            torch.nn.Linear(31, 16),
            torch.nn.ReLU(),
            torch.nn.Linear(16, 8),
            torch.nn.ReLU(),
            torch.nn.Linear(8, 4),
        )
        self.decoder = torch.nn.Sequential(
            torch.nn.Linear(4, 8),
            torch.nn.ReLU(),
            torch.nn.Linear(8, 16),
            torch.nn.ReLU(),
            torch.nn.Linear(16, 31),
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded
Here I define the model, the loss function and the optimizer:
model = AE()
loss_function = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(),
                             lr=CONFIG['LR'],
                             weight_decay=CONFIG['WD'])
Here I run the training loop:
epochs = CONFIG['EPOCHS']
dict_list = []
model.train()  # set training mode once, before the loop
for epoch in range(epochs):
    for ix, batch in enumerate(loader_train):
        reconstructed = model(batch)
        loss = loss_function(reconstructed, batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # loss.item() extracts the Python scalar instead of a NumPy array
        temp_dict = {'Epoch': epoch, 'Batch_N': ix, 'Batch_L': batch.shape[0], 'loss': loss.item()}
        dict_list.append(temp_dict)
df_learning_o = pd.DataFrame(dict_list)
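To sanity-check training, you can average the per-batch losses per epoch and plot the learning curve (a minimal sketch built on the df_learning_o DataFrame above; it assumes matplotlib is installed):

import matplotlib.pyplot as plt

# Mean reconstruction loss per epoch
df_learning_o.groupby('Epoch')['loss'].mean().plot()
plt.xlabel('Epoch')
plt.ylabel('Mean MSE loss')
plt.show()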
Answer:
You can simply return not just the decoded output but also the encoded embedding, like this:
class AE(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = torch.nn.Sequential(
            torch.nn.Linear(31, 16),
            torch.nn.ReLU(),
            torch.nn.Linear(16, 8),
            torch.nn.ReLU(),
            torch.nn.Linear(8, 4),
        )
        self.decoder = torch.nn.Sequential(
            torch.nn.Linear(4, 8),
            torch.nn.ReLU(),
            torch.nn.Linear(8, 16),
            torch.nn.ReLU(),
            torch.nn.Linear(16, 31),
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        # Return the bottleneck representation alongside the reconstruction
        return encoded, decoded
When you pass something to the model (in the training loop, for example), you have to change the call accordingly:
encoded, reconstructed = model(batch)
Now you can do whatever you'd like with the encoded embedding, which is the dimensionality-reduced input.
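For example, after training you could collect the 4-dimensional representation of the whole scaled training set like this (a minimal sketch using the t_train tensor from the question; eval mode and no_grad are standard inference practice):

model.eval()
with torch.no_grad():
    # encoded has shape (n_samples, 4): one 4-dimensional vector per row
    encoded, _ = model(t_train)
reduced = encoded.numpy()  # use this as the dimensionality-reduced dataset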