sklearn patchextractor...missing elements

Time:05-17

I am toying around with applying various image kernels to images in Python. I am using sklearn.feature_extraction to create the patches; however, when I do so, it appears some of the data is missing, which will cause problems when I go back to reconstruct the image. Am I doing something wrong, or do I have to add a buffer around the image when grabbing the patches, to handle border cases?

from PIL import Image
from sklearn.feature_extraction import image
import numpy as np

img = Image.open('a.png')
arr = np.array(img)
patches = image.PatchExtractor(patch_size=(3, 3)).fit(arr).transform(arr)

>>>arr.shape
(1080, 1080, 3)
>>>patches.shape
(1164240, 3, 3)
>>>1164240/1080
1078.0

CodePudding user response:

There are two things to understand here:

  1. image.PatchExtractor extracts all possible patches with strides of 1 in each dimension. For example, with patches of shape (3, 3) you will get arr[0:3, 0:3, 0], then arr[1:4, 1:4, 0], and so on. Hence, in general, for a patch size of (x, y) and image size of (w, h) you will get (w-x+1)*(h-y+1) many patches for each channel. The -x+1 and -y+1 is due to the patch hitting the image boundaries (there is no padding).

  2. PatchExtractor.transform() expects the first dimension of its input to be n_samples. So, in your case the shape should be (1, 1080, 1080, 3).
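The missing n_samples axis from point 2 can be added without copying the data, e.g. with np.newaxis (a minimal sketch, assuming a 1080x1080 RGB array like yours):

```python
import numpy as np

# A stand-in for your image array of shape (1080, 1080, 3).
arr = np.zeros((1080, 1080, 3))

# Prepend the n_samples (batch) dimension expected by PatchExtractor.
batched = arr[np.newaxis, ...]
print(batched.shape)  # (1, 1080, 1080, 3)
```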

Putting all this together, here is an example with a fake smaller image with one channel:

from sklearn.feature_extraction import image
import numpy as np

# Adding the n_samples dimension with reshape.
arr = np.arange(0, 6*6*1).reshape((1, 6, 6))
print(arr)
array([[[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11],
        [12, 13, 14, 15, 16, 17],
        [18, 19, 20, 21, 22, 23],
        [24, 25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34, 35]]])
# Get all possible patches.
patches = image.PatchExtractor(patch_size=(3, 3)).fit(arr).transform(arr)
print(np.shape(patches))
print(patches[0, :])
print(patches[1, :])
shape: 
   # (6-3+1) * (6-3+1) = 16
   (16, 3, 3)

patches[0, :]:
   array([[ 0.,  1.,  2.],
          [ 6.,  7.,  8.],
          [12., 13., 14.]])

patches[1, :]:
   array([[ 1.,  2.,  3.],
          [ 7.,  8.,  9.],
          [13., 14., 15.]])

As you can see, the result matches the explanation above: patch 2 is displaced by one pixel to the right with respect to patch 1.

Hence, in your case with an image of shape (1080, 1080, 3):

# You also need this reshape to add the n_samples dimension.
arr = np.arange(0, 1080*1080*3).reshape((1, 1080, 1080, 3))
patches = image.PatchExtractor(patch_size=(3, 3)).fit(arr).transform(arr)
print(np.shape(patches))
# (1080-3+1)*(1080-3+1) = 1162084
(1162084, 3, 3, 3)
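Since your end goal is to reconstruct the image, note that sklearn also provides image.reconstruct_from_patches_2d, which averages the overlapping patches back into an image of a given size. A minimal single-channel sketch using the companion function image.extract_patches_2d:

```python
import numpy as np
from sklearn.feature_extraction import image

# Small single-channel example: extract all 3x3 patches, then rebuild.
arr = np.arange(36, dtype=float).reshape(6, 6)
patches = image.extract_patches_2d(arr, patch_size=(3, 3))  # shape (16, 3, 3)

# Averages overlapping patch pixels back into a (6, 6) image.
reconstructed = image.reconstruct_from_patches_2d(patches, (6, 6))

print(np.allclose(arr, reconstructed))  # True
```

Because the full set of stride-1 patches is extracted, the averaging recovers the original image exactly.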

EDIT - patches with padding:

If you want to have the same number of patches for each pixel you could pad the image using np.pad(). Note that by default it pads all the axes, so we need to specify the pad amounts per axis manually:

# Pad only the two spatial axes; the total padding per axis should be
# patch_size-1, i.e. 1 pixel on each side for 3x3 patches.
# The format is (pad_before, pad_after) for each dimension.
paddings = ((0, 0), (1, 1), (1, 1), (0, 0))
wrapped_arr = np.pad(arr, pad_width=paddings, mode='wrap')
wrapped_patches = image.PatchExtractor(patch_size=(3, 3)).fit(wrapped_arr).transform(wrapped_arr)

print(np.shape(wrapped_patches))
# 1080*1080 = 1166400
(1166400, 3, 3, 3)
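The same padding trick can be checked quickly on the small 6x6 example from above: after wrap-padding by 1 on each side, every one of the 36 pixels gets its own patch, since (8-3+1)*(8-3+1) = 36.

```python
import numpy as np
from sklearn.feature_extraction import image

# Single-channel 6x6 image with an n_samples dimension.
arr = np.arange(36).reshape(1, 6, 6)

# Wrap-pad the spatial axes by 1 pixel on each side -> (1, 8, 8).
padded = np.pad(arr, pad_width=((0, 0), (1, 1), (1, 1)), mode='wrap')

patches = image.PatchExtractor(patch_size=(3, 3)).fit(padded).transform(padded)
print(patches.shape)  # (36, 3, 3): one patch per original pixel
```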