I am trying to implement an image stippling algorithm in Python and want to vectorize the calculation of the density (average luminance) of labelled image regions (Voronoi cells). Currently I can do this with a loop, but it is too computationally intensive for large numbers of regions. How can I vectorize this operation?
import numpy as np
from skimage import io
from scipy.interpolate import griddata
number_of_points = 1000
img = io.imread('https://www.kindpng.com/picc/m/111-1114964_house-icon-png-old-house-easy-drawing-transparent.png', as_gray=True)
height, width = img.shape
# generate random points
rng = np.random.default_rng()
points = rng.random((number_of_points,2)) * [width, height]
# calculate labelled regions
grid_x, grid_y = np.mgrid[0:width, 0:height]
labels = griddata(points, np.arange(number_of_points), (grid_x, grid_y), method='nearest')
# calculate density per region (mean of grayscale values of pixels in each region)
point_idxs = np.arange(len(points))
density = [np.mean(img[labels.T==i]) for i in point_idxs] # <-- this is the bottleneck
CodePudding user response:
The problem is not the loop itself but the fact that the algorithm is inefficient: a naive vectorization would use a lot of memory (which is slow) and barely speed up the computation. Indeed, img is read in full len(point_idxs) times. It can be read only once using np.add.at and np.bincount:
sumByLabel = np.zeros(np.max(labels) + 1)       # one accumulator per label
np.add.at(sumByLabel, labels.T, img)            # unbuffered scatter-add of each pixel value into its label's slot
countByLabel = np.bincount(labels.reshape(-1))  # number of pixels owned by each label
density = sumByLabel / countByLabel             # mean grayscale value per region
This takes 32 ms on my machine, while the initial code takes 539 ms (about 17x faster).
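As a side note, the per-label sums can also be computed with np.bincount and its weights argument, which avoids np.add.at entirely. A minimal sketch, assuming the same img, labels and number_of_points as in the question:
flat_labels = labels.T.ravel()                                       # label of each pixel, flattened in the same order as img
sumByLabel = np.bincount(flat_labels, weights=img.ravel(),
                         minlength=number_of_points)                 # per-label sum of pixel values
countByLabel = np.bincount(flat_labels, minlength=number_of_points)  # pixels per label
density = sumByLabel / countByLabel                                  # mean luminance per Voronoi cell
Passing minlength keeps both arrays aligned even if some label happens to own no pixel (such a cell yields NaN here, just as np.mean of an empty selection does in the original loop). In practice np.bincount tends to be faster than np.add.at, since ufunc.at is unbuffered, but either way img is read only once.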