I was recently given task (during exam, not funny) to create function returning cumulative sum along given dimension (input: 2d array), without use of np.cumsum ofc; to be honest i find this quite hard to even start with.
function should look like this:
def cumsum_2d(array : np.ndarray, dim : int = 0) -> np.ndarray:
and then result is supposed to be compared with result from actual np.cumsum
I would be grateful for even basic outline or general idea what to do.
CodePudding user response:
This approach is fairly "from scratch". It does use functools.reduce(), which I assume must be permitted.
import functools
import numpy as np
def cumsum_2d(array: np.ndarray, dim: int = 0) -> np.ndarray:
if not isinstance(dim, int) or not 0 <= dim <= 1:
raise ValueError('"dim": expected integer 0 or 1, got {dim}.')
elif not array.ndim == 2:
raise ValueError(
f"{array.ndim} dimensional array not allowed - 2 dimensional arrays expected."
)
array = array.T if dim == 1 else array
result = [
functools.reduce(lambda x, y: x y, array[: i 1]) for i in range(len(array))
]
result = np.array(result)
result = result.T if dim == 1 else result
return result
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
dim = 1
print(f"For dim = {dim} and a= \n{a}:")
print(f"...got: \n{cumsum_2d(a, dim)}")
print(f"...expected: \n{np.cumsum(a, dim)}")
This has the result:
# For dim = 1 and a=
# [[1 2 3]
# [4 5 6]
# [7 8 9]]:
# ...got:
# [[ 1 3 6]
# [ 4 9 15]
# [ 7 15 24]]
# ...expected:
# [[ 1 3 6]
# [ 4 9 15]
# [ 7 15 24]]
Trying with dim = 1 raises ValueError per the function definition - this mimics the AxisError raised by np.cumsum under similar circumstances:
ValueError: "dim": expected integer 0 or 1, got 2.
Lastly, trying with a non 2-D array also raises an customised ValueError as programmed, ensuring the user doesn't get any silently passed unexpected behaviour.
b = np.array([[[1, 2, 3], [1, 2, 3]], [[4, 5, 6], [1, 2, 3]], [[7, 8, 9], [1, 2, 3]]])
cumsum_2d(b, dim)
Result:
ValueError: 3 dimensional array not allowed - 2 dimensional arrays expected.
CodePudding user response:
Here is another approach that doesn't use ufunc.accumulate
or functools.reduce
.
It works by inserting an extra dimension, broadcasting the array along that dimension, and then doing a sum where it only considers indices less than or equal to the current index along the summation dimension.
It's morally similar to a brute-force approach where you make a bunch of copies of the array, set the elements you don't want to zero, and then doing the sum.
import numpy as np
def cumsum_2d(array: np.ndarray, dim: int = 0):
# Make sure the dim argument is positive
dim = dim % array.ndim
# Calculate the new shape with an extra copy of dim
shape_new = list(array.shape)
shape_new.insert(dim 1, array.shape[dim])
# Insert the new dimension and broadcast the array along that dimension
array = np.broadcast_to(np.expand_dims(array, dim 1), shape_new)
# Save the indices of the array
indices = np.indices(array.shape)
# Sum along the requested dimension, considering only the elements less than the current index
return np.sum(array, axis=dim, where=indices[dim] <= indices[dim 1])
a = np.random.random((4, 5))
assert np.array_equal(cumsum_2d(a, 1), np.cumsum(a, 1))
assert np.array_equal(cumsum_2d(a, 0), np.cumsum(a, 0))
assert np.array_equal(cumsum_2d(a, -1), np.cumsum(a, -1))
assert np.array_equal(cumsum_2d(a, -2), np.cumsum(a, -2))
Note that this function should work for arrays of any rank, not just two-dimensional ones.