I have been working on a task, where I implemented median cut for image quantization – representing the whole image by only limited set of pixels. I implemented the algorithm and now I am trying to implement the part, where I assign each pixel to a representant from the set found by median cut. So, I have variable 'color_space', which is 2d ndarray of shape (n,3), where n is the number of representatives. Then I have variable 'img', which is the original image of shape (rows, columns, 3).
Now I want to find the nearest pixel (bin) for each pixel from the image based on euclidean distance. I was able to come with this solution:
for row in range(img.shape[0]):
for column in range(img.shape[1]):
img[row][column] = color_space[np.linalg.norm(color_space - img[row][column], axis=1).argmin()]
What it does is, that for each pixel from the image, it computes the vector if distances from each of the bins and then it takes the closest one. Problem is, that this solution is quite slow and I would like to vectorize it - instead of getting vector for each pixel, I would like to get a matrix, where for example first row would be the first vector of distances computed in my code etc...
This problem could be converted into a problem, where I want to do a matrix multiplication, but instead of getting dot product of two vectors, I would get their euclidean distance. Is there some good approach to such problems? Some general solution in numpy, if we want to do 'matrix multiplication' in numpy, but the function Rn x Rn -> R does not need to be dot product, but for example euclidean distance. Of course, for the multiplication, the original image should be resized to (row*columns, 3), but that is a detail.
I have been studying the documentation and searching internet, but didn't find any good approach.
Please note that I don't want others to solve my assignment, the solution I came up with is totally ok, I am just curious whether I could speed it up as I try to learn numpy properly.
Thanks for any advices!
CodePudding user response:
Below is MWE for vectorizing your problem. See comments for explanation.
import numpy
# these are just random array declaration to work with.
image = numpy.random.rand(32, 32, 3)
color_space = numpy.random.rand(10,3)
# your code. I modified it to pick indexes
result = numpy.zeros((32,32))
for row in range(image.shape[0]):
for column in range(image.shape[1]):
result[row][column] = numpy.linalg.norm(color_space - image[row][column], axis=1).argmin()
result = result.astype(numpy.int)
# here we reshape for broadcasting correctly.
image = image.reshape(1,32,32,3)
color_space = color_space.reshape(10, 1,1,3)
# compute the norm on last axis, which is RGB values
result_norm = numpy.linalg.norm(image-color_space, axis=3)
# now compute the vectorized argmin
result_vectorized = result_norm.argmin(axis=0)
print(numpy.allclose(result, result_vectorized))
Eventually, you can get the correct solution by doing color_space[result]
. You may have to remove the extra dimensions that you add in color space to get correct shapes in this final operation.
CodePudding user response:
I think this approach might be a bit more numpy-ish/pythonic:
import numpy as np
from typing import *
from numpy import linalg as LA
# assume color_space is defined as a constant somewhere above and is of shape (n,3)
nearest_pixel_idxs: Callable[[np.ndarray], int] = lambda rgb: return LA.norm(color_space - rgb, axis=1).argmin()
img: np.ndarray = color_space[np.apply_along_axis(nearest_pixel_idxs, 1, img.reshape((-1, 3)))]
Why this solution might be more efficient:
- It relies on the parallelizable
apply_along_axis
functionnearest_pixel_idxs()
rather than the nested for-loops. This is made possible by reshapingimg
and thereby removing the need for double indexing. - It avoids repeated writes into
color_space
by only indexing into it once at the very end.
Let me know if you would like me to go into greater depth on any of this - happy to help.
CodePudding user response:
You could first broadcast to get all the combinations and then calculate each norm. You could then pick the smallest from there.
a = np.array([[1,2,3],
[2,3,4],
[3,4,5]])
b = np.array([[1,2,3],
[3,4,5]])
a = np.repeat(a.reshape(a.shape[0],1,3), b.shape[0], axis = 1)
b = np.repeat(b.reshape(1,b.shape[0],3), a.shape[0], axis = 0)
np.linalg.norm(a - b, axis = 2)
Each row of the result represents the distance of the row in a
to each of the representatives in b
array([[0. , 3.46410162],
[1.73205081, 1.73205081],
[3.46410162, 0. ]])
You can then use argmin
to get the final results.
IMO it is better to use (what @Umang Gupta proposed) numpy's automatic broadcasting than using repeat
.