I have a 2D numpy array of zeros and ones. For each element, I need the sum of the consecutive elements starting at that element and continuing until a zero is reached. I need one array for summing column-wise and another for summing row-wise. Below is a small example; the real arrays are huge because they are binary images, so I hope to do this without loops.
inp = [[0, 1, 1, 1, 0, 1, 1, 0, 1],
       [1, 1, 0, 0, 0, 1, 0, 1, 0],
       [1, 1, 1, 0, 1, 1, 0, 1, 1]]
col = mysum_cols(inp)
col = [[0, 3, 2, 1, 0, 2, 1, 0, 1],
       [2, 1, 0, 0, 0, 1, 0, 1, 0],
       [3, 2, 1, 0, 2, 1, 0, 2, 1]]
row = mysum_rows(inp)
row = [[0, 3, 1, 1, 0, 3, 1, 0, 1],
       [2, 2, 0, 0, 0, 2, 0, 2, 0],
       [1, 1, 1, 0, 1, 1, 0, 1, 1]]
CodePudding user response:
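It can indeed be done with a cumsum trick: reverse the array, label each run by the running count of zeros seen so far, and take a cumulative sum within each label. If you're willing to accept pandas: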
import pandas as pd
inp = [[0, 1, 1, 1, 0, 1, 1, 0, 1],
       [1, 1, 0, 0, 0, 1, 0, 1, 0],
       [1, 1, 1, 0, 1, 1, 0, 1, 1]]
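def do_sum(series):
    # 1 - series is 1 at every zero, so its cumsum starts a new group at each zero
    groups = series.rsub(1).cumsum()
    # the cumulative sum restarts within each group, i.e. after every zero
    return series.groupby(groups).cumsum()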
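# reverse both axes so the cumulative sums run from the end back toward the start
df = pd.DataFrame(inp).iloc[::-1, ::-1]
# axis=1 applies do_sum along each row, axis=0 along each column;
# the second reversal restores the original orientation
col = df.apply(do_sum, axis=1).iloc[::-1, ::-1].to_numpy()
row = df.apply(do_sum, axis=0).iloc[::-1, ::-1].to_numpy()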
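If you would rather stay in NumPy, here is a rough sketch of the same idea without pandas. The function name `run_sum_until_zero` is made up for this example, and it assumes the array only holds non-negative values (true for a binary image), because it relies on the cumulative sum never decreasing. It takes a cumsum over the reversed array and then subtracts the cumsum value at the most recent zero, which `np.maximum.accumulate` carries forward.

```python
import numpy as np

def run_sum_until_zero(a, axis):
    # Hypothetical helper name; assumes `a` contains only non-negative values.
    a = np.asarray(a)
    # Reverse so the sum runs from the far end back toward each element.
    r = np.flip(a, axis=axis)
    c = np.cumsum(r, axis=axis, dtype=np.int64)
    # At each zero, record the cumsum so far; carry the latest record forward.
    # This works because c never decreases for non-negative input.
    reset = np.maximum.accumulate(np.where(r == 0, c, 0), axis=axis)
    # Subtracting the record restarts the count after every zero.
    return np.flip(c - reset, axis=axis)

inp = np.array([[0, 1, 1, 1, 0, 1, 1, 0, 1],
                [1, 1, 0, 0, 0, 1, 0, 1, 0],
                [1, 1, 1, 0, 1, 1, 0, 1, 1]])

col = run_sum_until_zero(inp, axis=1)  # sums along each row, as in the question's "col"
row = run_sum_until_zero(inp, axis=0)  # sums down each column, as in the question's "row"
```

On the example from the question this should give the same `col` and `row` as expected there. I have not timed it against the pandas version on large images, so treat any speed difference as something to measure.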