I am trying to convert a nested loop over a numpy array into a numpy-optimized implementation.
The function being called inside the loop takes a 4D vector and a separate parameter, and outputs a 4D vector which is supposed to replace the old 4D vector based on operations with the new value. If relevant, the function a Welford online update which updates mean and standard deviation based on a new value, with the 4D vector being [old_mean, old_std, old_s, num_values]
. For each pixel channel, I am saving these values in the history_array
for updating the distribution based on future pixel values.
My present code looks like this:
def welford_next(arr:np.ndarray, new_point:np.float32) -> np.ndarray:
old_mean, _, old_s, num_points = arr
num_points = 1
new_mean = old_mean (new_point - old_mean) / num_points
new_s = old_s (new_point - old_mean) * (new_point - new_mean)
return [new_mean, np.sqrt(new_s / num_points) if num_points > 1 else new_s, new_s, num_points]
updates = [10., 20., 30., 40., 90., 80.]
history_array = np.zeros(shape = b.shape (4,)) # shape: [6,3,3,4]
print(f'History Shape: {history_array.shape}')
history_array_2 = np.zeros_like(history_array)
for update in updates:
image = np.empty(shape = b.shape) # shape: [6,3,3] (h x w x c)
image.fill(update)
for i, row in enumerate(image): # Prohibitively expensive
for j, col in enumerate(row):
for k, channel in enumerate(col):
history_array[i][j][k] = welford_next(history_array[i][j][k], channel)
history_array_2 = np.apply_along_axis(welford_next, axis=2, arr=history_array_2)
print(history_array == history_array_2)
However, the np.apply_along_axis()
is not seem to be viable because it does not allow additional parameters to be passed alongside the array itself.I also came across np.ufunc
which the welford_next()
function can be converted to using np.frompyfunc()
but it is unclear how it could help me reach the desired target.
How do I achieve this looped operation using numpy?
CodePudding user response:
The numpy optimized way to do this would be to change the way we use the welford_next()
function. As mentioned in the comments, repeated calls to a function cannot be optimized, thus the function call needs to be limited to once per frame and optimization needs to be done inside the function itself. The following implementation works ~ 50x faster.
def welford(history:np.ndarray, frame:np.ndarray) -> np.ndarray:
old_mean, _, old_s, num_points = np.transpose(history, [3,0,1,2])
num_points = 1.
new_mean = old_mean (frame - old_mean) / num_points
new_s = old_s (frame - old_mean) * (frame - new_mean)
new_std = np.sqrt(new_s / num_points) if num_points[0][0][0] > 1 else new_s
return np.transpose(np.array([new_mean, new_std, new_s, num_points]), [1,2,3,0])
updates = [10., 20., 30., 40., 90., 80.]
history_array = np.zeros(shape = b.shape (4,)) # shape: [6,3,3,4]
for update in updates:
image = np.empty(shape = b.shape) # shape: [6,3,3] (h x w x c)
image.fill(update)
history_array = welford(history_array, image)