Python: replace a value in an array based on the previous and following values in its column

Given the following array, I want to replace each zero with the previous value in its column, as long as the zero is surrounded by two values greater than zero. I am aware of np.where, but it would consider the whole array rather than individual columns. I am not sure how to do this, and help would be appreciated.

This is the array:

a=np.array([[4, 3, 3, 2],
            [0, 0, 1, 2],
            [0, 4, 2, 4],
            [2, 4, 3, 0]])

Since the only zero that meets this condition is the one in the second row, second column, the new array should be the following:

new_a=np.array([[4, 3, 3, 2],
               [0, 3, 1, 2],
               [0, 4, 2, 4],
               [2, 4, 3, 0]])

How do I accomplish this?

And what if I would like to extend this to longer gaps surrounded by nonzero values? For instance, the first column contains two consecutive zeros and the second column contains one zero, so the new array would be:

new_a=np.array([[4, 3, 3, 2],
               [4, 3, 1, 2],
               [4, 4, 2, 4],
               [2, 4, 3, 0]])

In short, how do I solve this if the column-wise condition is a run of N or fewer consecutive zeros?

CodePudding user response：

As a generic method, I would approach this using a convolution:

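import numpy as np
from scipy.signal import convolve2d

# kernel for top/down neighbors
kernel = np.array([[1],
                   [0],
                   [1]])
# is the value a zero?
m1 = a == 0
# count non-zero neighbors (cast to int so the convolution sums counts);
# rows outside the array count as zero, so edge rows never match
m2 = convolve2d((~m1).astype(int), kernel, mode='same') > 1

# zeros with a non-zero value directly above and below
mask = m1 & m2

# replace matching values with previous row value (modifies a in place)
a[mask] = np.roll(a, 1, axis=0)[mask]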

output:

array([[4, 3, 3, 2],
       [0, 3, 1, 2],
       [0, 4, 2, 4],
       [2, 4, 3, 0]])
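
For comparison, the same neighbor test can be sketched without SciPy by comparing row-shifted slices of the array. This is only an illustration, not the answerer's code: the function name fill_single_zeros is made up here, and it tests for strictly positive neighbors as the question asks, returning a new array instead of modifying a in place.

```python
import numpy as np

a = np.array([[4, 3, 3, 2],
              [0, 0, 1, 2],
              [0, 4, 2, 4],
              [2, 4, 3, 0]])

def fill_single_zeros(arr):
    """Return a copy in which every zero that has a positive value directly
    above and directly below it (same column) takes the value above."""
    out = arr.copy()
    # three row-aligned views: the row above, the row itself, the row below
    above, mid, below = arr[:-2], arr[1:-1], arr[2:]
    mask = (mid == 0) & (above > 0) & (below > 0)
    # first and last rows have only one vertical neighbor, so they are untouched
    out[1:-1] = np.where(mask, above, mid)
    return out

print(fill_single_zeros(a))
```

Because np.where is applied to the shifted slices, the comparison stays within each column, which addresses the concern in the question about np.where looking at the whole array.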

Filling from surrounding values

Using pandas to benefit from ffill/bfill (you can forward-fill in pure NumPy, but it's more complex):

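import pandas as pd
df = pd.DataFrame(a)

# maximum number of consecutive zeros to fill
N = 2

# identify non-zeros
m = df.ne(0)
# replace zeros with NaN
m2 = df.where(m)
# cells that have a non-zero value within N rows both above and below
mask = m2.ffill(limit=N).notna() & m2.bfill(limit=N).notna()
# blank out the zeros inside the mask, forward-fill, convert back to an integer array
df.mask(mask & ~m).ffill().astype(int).to_numpy()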

array([[4, 3, 3, 2],
       [4, 3, 1, 2],
       [4, 4, 2, 4],
       [2, 4, 3, 0]])
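
For readers who want to avoid pandas, here is a rough pure-NumPy sketch of the same idea. It is not part of the answer above: the function name fill_short_gaps is hypothetical, and it assumes a 2-D integer array in which "non-zero" is what counts as a boundary value. For each cell it finds the row of the nearest non-zero value above and below, and fills a zero only when both exist and the run of zeros between them is at most n long.

```python
import numpy as np

a = np.array([[4, 3, 3, 2],
              [0, 0, 1, 2],
              [0, 4, 2, 4],
              [2, 4, 3, 0]])

def fill_short_gaps(arr, n):
    """Forward-fill zeros column-wise, but only inside runs of at most
    n consecutive zeros that have a non-zero value on both sides."""
    n_rows = arr.shape[0]
    rows = np.arange(n_rows)[:, None]  # column vector of row indices
    nonzero = arr != 0

    # row index of the nearest non-zero at or above each cell (-1 if none)
    prev_idx = np.maximum.accumulate(np.where(nonzero, rows, -1), axis=0)
    # row index of the nearest non-zero at or below each cell (n_rows if none)
    next_idx = np.minimum.accumulate(
        np.where(nonzero, rows, n_rows)[::-1], axis=0)[::-1]

    # number of zeros between the two surrounding non-zero values
    gap = next_idx - prev_idx - 1
    fill = ~nonzero & (prev_idx >= 0) & (next_idx < n_rows) & (gap <= n)

    # value found at prev_idx in the same column (clipped so -1 is a valid index)
    previous = np.take_along_axis(arr, np.clip(prev_idx, 0, None), axis=0)
    return np.where(fill, previous, arr)

print(fill_short_gaps(a, 2))
```

With n=2 this gives the array shown above; with n=1 the two-zero run in the first column is left alone and only the single zero in the second column is filled.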

CodePudding user response：

That's one solution I found. I know it's basic but I think it works.

a=np.array([[4, 3, 3, 2],
            [0, 0, 1, 2],
            [0, 4, 2, 4],
            [2, 4, 3, 0]])
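# a.T is a view, so writing to a_t also updates a
a_t = a.T

for i in range(len(a_t)):
    ar = a_t[i]  # column i of a
    for j in range(len(ar)-1):
        # a zero followed by a positive value takes the previous value
        # (if the previous value is also 0, nothing changes)
        if (j>0) and (ar[j] == 0) and (ar[j+1] > 0):
            a_t[i][j] = a_t[i][j-1]
a = a_t.T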