I am trying to write code that replaces every run of three or more consecutive equal values in a row with zeros, so the three 3s in the first row should become zeros. I wrote the code below, which I think should work, but when I run it I seem to be stuck in an infinite loop.
import numpy as np

A = np.array([[1, 2, 3, 3, 3, 4],
              [1, 3, 2, 4, 2, 4],
              [1, 2, 4, 2, 4, 4],
              [1, 2, 3, 5, 5, 5],
              [1, 2, 1, 3, 4, 4]])
row_nmbr, column_nmbr = A.shape
row = 0
column = 0
while column < column_nmbr:
    next_col = column + 1
    next_col2 = next_col + 1
    if A[row][column] == A[row][next_col] and A[row][next_col] == A[row][next_col2]:
        A[row][column] = 0
        column += 1
print(A)
CodePudding user response:
Avoid chaining if-else checks here; it gets messy quickly. Here is an approach that does not rely on them:
1. Iterate over each row and find its unique elements and their counts.
2. Keep only the elements that occur three or more times.
3. Loop over each of these filtered elements (val).
4. Find the indices of val in the given row.
5. Group the indices from step 4 with groupby to find blocks of contiguous indices.
6. Check whether a block contains three or more indices.
7. If it does, do the replacement.
The following sample code scales to larger arrays and handles several contiguous runs in the same row.
import numpy as np
from functools import partial
from itertools import groupby
from operator import itemgetter

A = np.array([[3, 3, 5, 3, 3, 3, 5, 5, 5, 6, 6, 5, 5, 5],
              [1, 8, 8, 4, 7, 4, 7, 7, 7, 7, 1, 2, 3, 9],
              [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5],
              [1, 2, 3, 3, 3, 3, 3, 2, 1, 1, 1, 2, 2, 2],
              [1, 2, 1, 3, 4, 4, 9, 8, 8, 8, 8, 9, 9, 8]])

def func1d(row, replacement):
    # work on a copy so that A stays intact for the comparison below
    row = row.copy()
    # find and keep the elements that occur three or more times
    vals, count = np.unique(row, return_counts=True)
    vals = vals[count >= 3]
    # iterate over each filtered element (val)
    for val in vals:
        # get the indices of val in the row
        indices = (row == val).nonzero()[0]
        # find contiguous indices: they share the same (index - position) key
        for k, g in groupby(enumerate(indices), lambda t: t[1] - t[0]):
            l = list(map(itemgetter(1), g))
            # if a block has three or more indices, do the replacement
            if len(l) >= 3:
                row[l] = replacement
    return row

wrapper = partial(func1d, replacement=0)
result = np.apply_along_axis(wrapper, 1, A)
print(result)
Output, when compared with A:
# original array
[[3 3 5 3 3 3 5 5 5 6 6 5 5 5]
[1 8 8 4 7 4 7 7 7 7 1 2 3 9]
[1 1 1 2 2 2 3 3 3 4 4 4 4 5]
[1 2 3 3 3 3 3 2 1 1 1 2 2 2]
[1 2 1 3 4 4 9 8 8 8 8 9 9 8]]
# array with replaced values
[[3 3 5 0 0 0 0 0 0 6 6 0 0 0]
[1 8 8 4 7 4 0 0 0 0 1 2 3 9]
[0 0 0 0 0 0 0 0 0 0 0 0 0 5]
[1 2 0 0 0 0 0 2 0 0 0 0 0 0]
[1 2 1 3 4 4 9 0 0 0 0 9 9 8]]
CodePudding user response:
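Your loop is infinite because column never gets incremented: it stays at 0, which is always less than column_nmbr.
Here is a version that loops over the rows and uses np.unique. Note that it zeros every value that occurs three or more times in a row, whether or not the occurrences are adjacent (see the third row of the output):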
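# A and row_nmbr are defined as in the question
for i in range(row_nmbr):
    # unique values of the row, plus each element's position in that unique list
    m, k = np.unique(A[i], return_inverse=True)
    # values that occur more than twice in this row
    val = m[np.bincount(k) > 2]
    if len(val) > 0:
        # np.isin also works when several values qualify in the same row
        A[i][np.isin(A[i], val)] = 0
print(A)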
Output:
[[1 2 0 0 0 4]
[1 3 2 4 2 4]
[1 2 0 2 0 0]
[1 2 3 0 0 0]
[1 2 1 3 4 4]]
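The loop from the question can also be repaired directly so that only contiguous runs are replaced. The following is a minimal sketch under a few assumptions: the function name zero_runs and the min_len parameter are made up for illustration, the input is a 2-D integer array, and the result is returned as a copy rather than written in place. The two points it tries to show are that the scan position advances on every pass (which avoids the infinite loop) and that the inner loop never reads past the end of the row.

```python
import numpy as np

def zero_runs(arr, min_len=3):
    """Return a copy of a 2-D array with each horizontal run of
    min_len or more equal values set to 0 (hypothetical helper)."""
    out = arr.copy()
    n_rows, n_cols = arr.shape
    for r in range(n_rows):
        start = 0
        while start < n_cols:
            end = start + 1
            # extend the run while the next value equals the value at start
            while end < n_cols and arr[r, end] == arr[r, start]:
                end += 1
            if end - start >= min_len:
                out[r, start:end] = 0
            # always advance, whether or not the run was replaced
            start = end
    return out

A = np.array([[1, 2, 3, 3, 3, 4],
              [1, 3, 2, 4, 2, 4],
              [1, 2, 4, 2, 4, 4],
              [1, 2, 3, 5, 5, 5],
              [1, 2, 1, 3, 4, 4]])
print(zero_runs(A))
```

With the array from the question, only the 3s in the first row and the 5s in the fourth row are zeroed; the third row is left unchanged because its 4s are not adjacent.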