Saving tensors to a .pt file in order to create a dataset


I was tasked with creating a dataset to test the functionality of the code we're working on.

The dataset must have a group of tensors that will be used later on in a generative model.

I'm trying to save the tensors to a .pt file, but each save overwrites the previous one, so the file ends up containing only a single tensor. I've read about torch.utils.data.Dataset, but I can't figure out on my own how to use it.

Here is my code:

import torch
import numpy as np

from torch.utils.data import Dataset

#variables that will be used to create the size of the tensors:
num_jets, num_particles, num_features = 1, 30, 3


for i in range(100):
    #tensor from a gaussian dist with mean=5,std=1 and shape=size:
    tensor = torch.normal(5,1,size=(num_jets, num_particles, num_features)) 

    #We will need the tensors to be of the cpu type
    tensor = tensor.cpu()

    #save the tensor to 'tensor_dataset.pt'
    torch.save(tensor,'tensor_dataset.pt')


#open the recently created .pt file inside a list
tensor_list = torch.load('tensor_dataset.pt')

#prints the list. Just one tensor inside .pt file
print(tensor_list)

Answer:

Reason: each iteration of your loop calls torch.save with the same filename, so tensor_dataset.pt is overwritten every time and only the last tensor survives; you never build up a collection.

Solution: since you know the tensor size in advance, you can pre-allocate a single tensor with an extra leading dimension, fill it inside the loop, and save it once after the loop:

import torch

num_jets, num_particles, num_features = 1, 30, 3

# pre-allocate one tensor that will hold all 100 samples
lst_tensors = torch.empty(size=(100, num_jets, num_particles, num_features))

for i in range(100):
    # fill slot i with a sample from a Gaussian with mean=5, std=1
    lst_tensors[i] = torch.normal(5, 1, size=(num_jets, num_particles, num_features))

    # make sure the tensor lives on the CPU (torch.empty already allocates on the CPU)
    lst_tensors[i] = lst_tensors[i].cpu()


# save the whole collection once, after the loop
torch.save(lst_tensors, 'tensor_dataset.pt')

tensor_list = torch.load('tensor_dataset.pt')

print(tensor_list.shape)   # torch.Size([100, 1, 30, 3])
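
If you would rather not pre-allocate, an equivalent approach (a minimal sketch using the same shapes and filename as above) is to collect the tensors in a Python list and stack them once before saving:

import torch

num_jets, num_particles, num_features = 1, 30, 3

# draw 100 samples and collect them in a Python list
tensors = [torch.normal(5, 1, size=(num_jets, num_particles, num_features))
           for _ in range(100)]

# stack the list into a single tensor of shape [100, 1, 30, 3] and save it once
torch.save(torch.stack(tensors), 'tensor_dataset.pt')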
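
Since the question mentions torch.utils.data.Dataset: once the file contains a single stacked tensor, you can wrap it in the built-in TensorDataset and iterate over it with a DataLoader. A minimal sketch, assuming the file was written as shown above (the batch size of 10 is arbitrary):

import torch
from torch.utils.data import TensorDataset, DataLoader

# load the stacked tensor of shape [100, 1, 30, 3] and wrap it in a Dataset
data = torch.load('tensor_dataset.pt')
dataset = TensorDataset(data)

# a DataLoader then yields batches; each dataset item is a 1-tuple
loader = DataLoader(dataset, batch_size=10, shuffle=True)
for (batch,) in loader:
    print(batch.shape)   # torch.Size([10, 1, 30, 3])

TensorDataset simply indexes along the first dimension, so each item is one [1, 30, 3] tensor wrapped in a tuple.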