I have a classic dataset of images and labels.
Here is a simplified version of the __getitem__ function:
def __getitem__(self, index):
    (img_path, label) = df.iloc[index].values
    img = Image.open(img_path).convert("RGB")
    y = torch.tensor(label)
    return (img, y)
I have:
dataset = ClassDataset()
train_set, validation_set = random_split(dataset, [train_size, validation_size])
train_loader = DataLoader(dataset=train_set, batch_size=32)
One batch from the train loader then has shape [32, 3, 256, 256], with 32 being the batch size, 3 the number of channels, and 256 the width and height of my images.
I want to reshape each batch so that it is sequential, [8, 4, 3, 256, 256], with 8 being the batch size and 4 the length of one sequence.
I know this could easily be done with Tensor.view() or torch.reshape(), since my data are already in the right order (they can be grouped directly into sequences).
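For example, a sketch like this would give me the shape I want (assuming the batch comes out of train_loader already ordered into sequences of 4 and the batch size is divisible by 4):
seq_len = 4
for imgs, labels in train_loader:
    b, c, h, w = imgs.shape                            # [32, 3, 256, 256]
    imgs = imgs.view(b // seq_len, seq_len, c, h, w)   # [8, 4, 3, 256, 256]
    labels = labels.view(b // seq_len, seq_len)        # [8, 4]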
But I want to know the most sensible place to make this change: in the dataset class, in the DataLoader, or in the training loop.
I already tried returning sequences from __getitem__:
(img_path, coords) = df.iloc[4*(index-1):4*index].values
(assuming a sequence length of 4), but it didn't work.
CodePudding user response:
It is more relevant to do this kind of processing in the dataset layer. What you are implementing there is "given a dataset index, return the corresponding input and its label". In your case the input is a sequence, so it makes sense for your __getitem__ to return a sequence of images, along the lines of the sketch below.
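For example (a sketch only, not your actual code: the SequenceDataset name, the seq_len argument, and the img_path/label column names are assumptions based on your snippet):
import pandas as pd
import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision.transforms.functional import to_tensor

class SequenceDataset(Dataset):
    """Each item is one non-overlapping sequence of seq_len consecutive rows."""

    def __init__(self, df: pd.DataFrame, seq_len: int = 4):
        self.df = df
        self.seq_len = seq_len

    def __len__(self):
        # One item per full sequence, so the dataset is seq_len times shorter.
        return len(self.df) // self.seq_len

    def __getitem__(self, index):
        # Rows seq_len*index .. seq_len*(index+1)-1 form sequence number `index`.
        rows = self.df.iloc[index * self.seq_len : (index + 1) * self.seq_len]
        # Images are assumed to already be 256x256; otherwise add a resize here.
        imgs = [Image.open(p).convert("RGB") for p in rows["img_path"]]
        x = torch.stack([to_tensor(img) for img in imgs])  # (seq_len, 3, 256, 256)
        y = torch.tensor(rows["label"].values)             # (seq_len,)
        return x, y
Note that __len__ shrinks by a factor of seq_len as well; that is probably why your df.iloc[4*(index-1):4*index] attempt failed: the sampler kept drawing indices for the original per-image length, and index 0 maps to an empty slice with that formula.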
The data loader will then automatically collate the data such that you get (batch_size, seq_len, channel, height, width) for your input and (batch_size, seq_len) for your label (or (batch_size,) if there is meant to be a single label per sequence).
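For instance, continuing the sketch above (df and the batch size of 8 are assumptions):
from torch.utils.data import DataLoader

# Shuffling items is safe here: each item is already a whole sequence.
loader = DataLoader(SequenceDataset(df, seq_len=4), batch_size=8, shuffle=True)
xb, yb = next(iter(loader))
print(xb.shape)  # torch.Size([8, 4, 3, 256, 256])
print(yb.shape)  # torch.Size([8, 4])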