Generating probabilities from patches of an image


I am working with an image of size 512x512. The image is divided into patches using einops with a patch size of 32, giving 256 patches overall; in other words, we get a new "image" of size 256x1024.

Since this image is actually a mask for a segmentation problem, it consists of only 4 values (4 classes): 0 for the background, 1 for the first class, 2 for the second class, and 3 for the third class.

My goal is to take every patch, and compute for every class C the following:

Number of pixels labeled C in this patch / Number of pixels in this patch.

This should give me an array of size 4 where the first entry is the number of background pixels (labeled as 0) divided by the total number of pixels in the patch (1024); the second, third, and fourth entries are the same but for the corresponding class.
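For example, if a patch contained 512 background pixels, 256 pixels of class 1, 128 of class 2, and 128 of class 3 (hypothetical counts, just to illustrate), the desired array would be [512/1024, 256/1024, 128/1024, 128/1024] = [0.5, 0.25, 0.125, 0.125], which sums to one.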

In theory, I know that I need to iterate over every single patch, count how many pixels of each class exist in the current patch, and then divide by 1024. Doing this 256 times yields exactly what I want. The problem is that I have a (very) large number of images that I need to do this for, and the size of 512 is just an example to keep the question simple, so a for loop is out of the question.
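For reference, the naive version would look something like this (a minimal sketch; to_patch is the 256x1024 patch array built below):

import numpy as np

# One pass per patch: count the pixels of each class, then normalize by the patch size
probs = np.zeros((to_patch.shape[0], 4))
for i, patch in enumerate(to_patch):
    probs[i] = np.bincount(patch, minlength=4) / patch.size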

I know that I can get the result that I want using numpy. I tried both numpy.apply_over_axes and numpy.apply_along_axis, but I don't know which one is better suited for this task; there is also numpy.where, which I don't know how to apply here.

Here is what I did:

from einops import rearrange
import numpy as np

labn = np.random.randint(4, size=(512, 512))     # every pixel is one of the values 0, 1, 2, 3
to_patch = rearrange(labn, "(h p1) (w p2) -> (h w) (p1 p2)", p1=32, p2=32)
print(to_patch.shape) # (256, 1024)

# One comparison array per class (a plain scalar, e.g. a == 0, would broadcast just as well)
c0 = np.full(1024, 0)
c1 = np.full(1024, 1)
c2 = np.full(1024, 2)
c3 = np.full(1024, 3)

def f(a):
    # Boolean masks marking the pixels of each class in the patch
    _c0 = a == c0
    _c1 = a == c1
    _c2 = a == c2
    _c3 = a == c3
    # Count the pixels per class and normalize by the patch size
    pr = np.array([np.sum(_c0), np.sum(_c1), np.sum(_c2), np.sum(_c3)]) / 1024
    return pr

resf = np.apply_along_axis(f, 1, to_patch)
print(resf.shape) # (256, 4)

Two things:

  1. I want the output to be 256x4, where every row along the second axis sums to one.
  2. Is there a faster/better/more Pythonic way to do this, preferably vectorized?

EDIT: I forgot to add the sum, so now I do get 256x4.

CodePudding user response:

There is a built-in function to count occurrences: torch.histc. It is similar to Python's collections.Counter.

torch.histc(input, bins=100, min=0, max=0, *, out=None) → Tensor
Computes the histogram of a tensor.

The elements are sorted into equal width bins between min and max. If min and max are both zero, the minimum and maximum values of the data are used.

Elements lower than min and higher than max are ignored.
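For instance (a minimal sketch; note that torch.histc only accepts floating-point tensors):

import torch

# 4 bins over the range [0, 3]: one bin per class value
torch.histc(torch.tensor([0., 1., 1., 3.]), bins=4, min=0, max=3)
# tensor([1., 2., 0., 1.])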

You need to specify the number of bins, here the number of classes C, as well as the min and max values that define the bin range. Also, it won't reduce over a single dimension of a multi-dimensional tensor: the resulting histogram contains global statistics of the whole input regardless of its shape. As a possible workaround, you can iterate through your patches, calling torch.histc each time, then stack the results and normalize:

resf = torch.stack([torch.histc(patch, C, min=0, max=C-1) for patch in x]) / x.size(1)
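To make this concrete with the question's setup, here is a minimal end-to-end sketch (assumptions on my part: x holds the patches converted to a float tensor, since torch.histc does not accept integer tensors, and C = 4 classes):

import torch

C = 4                                    # number of classes (4 in the question)
x = torch.from_numpy(to_patch).float()   # (256, 1024); histc requires a floating-point tensor

# One histogram per patch, stacked into (256, C), then normalized by the patch size
resf = torch.stack([torch.histc(patch, C, min=0, max=C - 1) for patch in x]) / x.size(1)
print(resf.shape)       # torch.Size([256, 4])
print(resf.sum(dim=1))  # every row sums to 1

If you would rather stay in numpy, an equivalent vectorized expression (my own suggestion, not part of the original answer) is (to_patch[:, :, None] == np.arange(4)).mean(axis=1), which also yields the 256x4 array of per-class fractions.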