Creating new observations in pytorch ImageFolder

Time:10-24

I am new to PyTorch, and what I would like to do is probably easy, but I have not found anything online about actually increasing the number of observations without adding new files to the image folder (in my case). I don't want to add images to the folder because I want to experiment with different transformations and see which works best, without deleting images all the time. So what I do is:

import os
import torchvision
from torch.utils.data import DataLoader
from torchvision import transforms

trf = transforms.Compose([
    transforms.ToTensor(),
    transforms.RandomRotation(degrees=45),
    transforms.Grayscale(num_output_channels=1),
    transforms.Normalize(0, 1),
    transforms.functional.invert
])
train_data = torchvision.datasets.ImageFolder(root='./splitted_data/train', transform=trf)
print(len(train_data))
train = DataLoader(train_data, batch_size=batch_size, shuffle=True, num_workers=os.cpu_count())

Here the output is the same as the number of images across all folders, which means the transformations were applied to the existing observations in place, and this is not what I want. I want each transformation to produce a separate copy of each observation. How can I do that?

CodePudding user response:

You can implement a transform wrapper that applies transforms sequentially and outputs every single transform combination. The issue with Torchvision's random transforms is that their parameters are sampled each time the transform is called, which makes it difficult to reproduce an identical transformation across images. One alternative is to stack or concatenate all the images and apply the transform once to that stack.

I divided the transformation pipeline into three sections: the preprocessing transform, the post-processing transform (the latter should not be stochastic, since it is applied to each variant separately), and the main transforms, i.e. the list of transforms you want to create combinations from, here RandomRotation and Grayscale.

Be aware that this solution has limitations when working with transforms that change the number of channels, such as Grayscale. Generally, you want to keep the same tensor dimensions throughout, otherwise your concatenations and/or stacks will fail.

Here is a possible solution:

import torch
import torch.nn as nn
import torchvision.transforms as T

class Combination(nn.Module):
    def __init__(self, transforms, pre, post):
        super().__init__()
        self.transforms = transforms
        self.pre = T.Compose(pre)
        self.post = T.Compose(post)

    def stacked_t(self, t, x):
        # Concatenate all current variants, apply `t` once so the same
        # sampled parameters are used everywhere, then split back.
        lengths = [len(o) for o in x]
        return list(t(torch.cat(x)).split(lengths))

    def forward(self, x):
        out = [self.pre(x)[None]]
        for t in self.transforms:
            out += self.stacked_t(t, out)  # <- for every transform `t` we double
                                           #    the number of instances in `out`
        out = [self.post(o)[0] for o in out]
        return out

Here is an example usage with an input image:

>>> img

[input image preview omitted]

Initialize the transform combination:

>>> t = Combination(pre=[T.ToTensor()],
...                 post=[T.Normalize(0, 1),
...                       T.functional.invert],
...                 transforms=[T.RandomRotation(degrees=45),
...                             T.Grayscale(num_output_channels=1)])

Here is a preview of the different transform combinations:

>>> img_ = t(img)

[image previews of the four combinations img_[0] through img_[3] omitted]
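Note that the forward pass returns a list of variants per image, so the reported dataset length itself does not change. To actually grow the number of observations, as asked in the question, one option is a small wrapper dataset that flattens the k variants into k separate items. This is a sketch beyond the original answer; `ExpandedDataset` and its arguments are hypothetical names:

```python
import torch
from torch.utils.data import Dataset

class ExpandedDataset(Dataset):
    """Hypothetical wrapper: if base[i] returns (list_of_k_variants, label),
    expose each variant as its own observation, k * len(base) items in total."""

    def __init__(self, base, k):
        self.base = base
        self.k = k  # number of transform combinations per image

    def __len__(self):
        return len(self.base) * self.k

    def __getitem__(self, idx):
        variants, label = self.base[idx // self.k]
        return variants[idx % self.k], label
```

With the Combination above producing four variants per image, wrapping the ImageFolder dataset in `ExpandedDataset(train_data, k=4)` would report four times the original length.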