How to replace repeated items in a row of my array with zeros

I am trying to write code that replaces every run of three or more consecutive equal values in a row with zeros, so the three 3s in the first row should become zeros. I wrote the code below, which I thought would work, but when I run it, it seems to get stuck in an infinite loop.

import numpy as np

A = np.array([[1, 2, 3, 3, 3, 4], 
              [1, 3, 2, 4, 2, 4], 
              [1, 2, 4, 2, 4, 4],
              [1, 2, 3, 5, 5, 5], 
              [1, 2, 1, 3, 4, 4]])

row_nmbr,column_nmbr = (A.shape)
row = 0
column = 0


while column < column_nmbr:
  next_col = column + 1
  next_col2 = next_col + 1
  if A[row][column] == A[row][next_col] and A[row][next_col] == A[row][next_col2]:
    A[row][column] = 0
    column += 1
print(A)

CodePudding user response:

Avoid chains of nested if-else; they get messy quickly. Here's an approach that groups indices instead of comparing neighbours one by one.

1. Iterate over each row and find its unique elements and their counts.
2. Keep only the elements that occur three or more times in the row.
3. Iterate over each of these filtered elements (val).
4. Find the indices of val in the given row.
5. Do a groupby on the indices from step 4 to find blocks of contiguous indices.
6. Check whether a block contains three or more indices.
7. If it does, replace the values at those indices.
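Step 5 is the least obvious part, so here is a minimal sketch of it in isolation. The sample row and the variable name blocks are illustrative and not taken from the answer. It relies on the fact that, within a run of consecutive indices, the index minus its position in the list stays constant.

```python
from itertools import groupby
from operator import itemgetter

import numpy as np

row = np.array([3, 3, 5, 3, 3, 3, 5, 5, 5])
indices = (row == 3).nonzero()[0]  # positions of the value 3: [0, 1, 3, 4, 5]

# enumerate() pairs each index with its position in the list. For consecutive
# indices the difference (index - position) is constant, so grouping on that
# difference splits the indices into contiguous blocks.
blocks = [
    [int(i) for i in map(itemgetter(1), group)]
    for _, group in groupby(enumerate(indices), key=lambda t: t[1] - t[0])
]
print(blocks)  # [[0, 1], [3, 4, 5]]
```

Only the second block has three or more indices, so only positions 3 to 5 would be replaced.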

The following sample code handles rows that contain several separate runs, including runs of different values.

from functools import partial
from itertools import groupby
from operator import itemgetter

import numpy as np


A = np.array([[3, 3, 5, 3, 3, 3, 5, 5, 5, 6, 6, 5, 5, 5], 
              [1, 8, 8, 4, 7, 4, 7, 7, 7, 7, 1, 2, 3, 9],
              [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5],
              [1, 2, 3, 3, 3, 3, 3, 2, 1, 1, 1, 2, 2, 2], 
              [1, 2, 1, 3, 4, 4, 9, 8, 8, 8, 8, 9, 9, 8]])


def func1d(row, replacement):
    # find and filter elements which occurs three or more times
    vals, count = np.unique(row, return_counts=True)
    vals = vals[count >= 3]

    # Iteration for each filtered element (val)
    for val in vals:
        # get indices of val from row
        indices = (row == val).nonzero()[0]

        # find contiguous indices
        for _, group in groupby(enumerate(indices), lambda t: t[1] - t[0]):
            block = list(map(itemgetter(1), group))
            # if grouped indices are three or more, do replacement
            if len(block) >= 3:
                row[block] = replacement

    return row


wrapper = partial(func1d, replacement=0)
# work on a copy so the original A is left untouched for comparison
result = np.apply_along_axis(wrapper, 1, A.copy())
print(result)

Output, when compared with A:

# original array
[[3 3 5 3 3 3 5 5 5 6 6 5 5 5]
 [1 8 8 4 7 4 7 7 7 7 1 2 3 9]
 [1 1 1 2 2 2 3 3 3 4 4 4 4 5]
 [1 2 3 3 3 3 3 2 1 1 1 2 2 2]
 [1 2 1 3 4 4 9 8 8 8 8 9 9 8]]

# array with replaced values
[[3 3 5 0 0 0 0 0 0 6 6 0 0 0]
 [1 8 8 4 7 4 0 0 0 0 1 2 3 9]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 5]
 [1 2 0 0 0 0 0 2 0 0 0 0 0 0]
 [1 2 1 3 4 4 9 0 0 0 0 9 9 8]]

CodePudding user response:

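Your loop never terminates because column is only updated inside the if block. On the first pass the condition is false, so column stays at 0 and remains less than column_nmbr forever.

The following counts how often each value occurs in a row and zeroes every value that occurs three or more times. Note that it counts occurrences anywhere in the row, not only adjacent ones (see the 4s in the third row of the output):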

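for i in range(row_nmbr):
    # unique values in row i, plus each element's index into that list
    m, k = np.unique(A[i], return_inverse=True)
    # values that occur three or more times anywhere in the row
    val = m[np.bincount(k) > 2]
    if len(val) > 0:
        # np.isin also handles rows where several values qualify
        A[i][np.isin(A[i], val)] = 0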


print(A)

Output:

[[1 2 0 0 0 4]
 [1 3 2 4 2 4]
 [1 2 0 2 0 0]
 [1 2 3 0 0 0]
 [1 2 1 3 4 4]]
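For comparison, here is a minimal sketch that stays close to the loop in the question but advances the column counter on every pass and replaces only adjacent repeats. The function name zero_runs and the min_len parameter are illustrative and do not appear in the thread.

```python
import numpy as np


def zero_runs(arr, min_len=3):
    """Return a copy of arr with each row-wise run of >= min_len equal values set to 0."""
    out = arr.copy()
    n_rows, n_cols = arr.shape
    for r in range(n_rows):
        start = 0
        while start < n_cols:
            # Move `end` to one past the last column of the current run.
            end = start + 1
            while end < n_cols and arr[r, end] == arr[r, start]:
                end += 1
            if end - start >= min_len:
                out[r, start:end] = 0
            # Always advance, whether or not the run was replaced.
            start = end
    return out


A = np.array([[1, 2, 3, 3, 3, 4],
              [1, 3, 2, 4, 2, 4],
              [1, 2, 4, 2, 4, 4],
              [1, 2, 3, 5, 5, 5],
              [1, 2, 1, 3, 4, 4]])
print(zero_runs(A))
```

On the array from the question this zeroes the three 3s in the first row and the three 5s in the fourth row, and leaves the third row unchanged because its 4s are not all adjacent.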