NumPy - Remove subarrays containing identical elements

Time:03-23

Suppose I have the following 4 by 3 by 3 array,

array([[[-2, -2, -2],
        [-2, -2, -2],
        [-2, -2, -2]],

       [[-2, -2, -2],
        [-2, -2, -2],
        [-2, -2, -2]],

       [[-2, -2, 71],
        [-1, -1, -1],
        [71, -1, 52]],

       [[-2, -2, -2],
        [-2, -2, -2],
        [-2, -2, -2]]])

I would like to filter this array by the following rule:

Treat each 3 by 3 array as a block. If every element in a block is equal to -2, the whole block should be removed, so the target array would look like this (1 by 3 by 3):

array([[[-2, -2, 71],
        [-1, -1, -1],
        [71, -1, 52]]])

I could only come up with a brute force solution with an explicit if condition and a for loop, but it does not work. Can anyone share a better method?

You can recreate the original array with the following commands:

import numpy as np

array = np.array([[-2,-2,-2,-2,-2,-2,-2,-2,-2],
    [-2,-2,-2,-2,-2,-2,-2,-2,-2],
    [-2,-2,71,-1,-1,-1,71,-1,52],
    [-2,-2,-2,-2,-2,-2,-2,-2,-2]])
newarr = array.reshape(4,3,3)

CodePudding user response:

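The other answers are nice if you expect the blocks along axis 0 to always be filled with one specific value, like -2, or 872385, etc.

If you want something more general that filters out any block filled with a single repeated value, you can filter the blocks by matrix rank.

Any matrix filled with a single nonzero value has rank 1, so you can filter by rank != 1 (here x is the 4 by 3 by 3 array from the question). Treat this as a heuristic: a non-constant block whose rows are all multiples of one row also has rank 1 and would be removed, and an all-zero block has rank 0 and would be kept.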

In [2]: x[np.linalg.matrix_rank(x) != 1]
Out[2]:
array([[[-2, -2, 71],
        [-1, -1, -1],
        [71, -1, 52]]])

This removes every block along axis 0 that is filled with a single nonzero value, whatever that value is. Another example:

In [4]: x
Out[4]:
array([[[ 5,  5,  5],
        [ 5,  5,  5],
        [ 5,  5,  5]],

       [[-2, -2, -2],
        [-2, -2, -2],
        [-2, -2, -2]],

       [[-2, -2, 71],
        [-1, -1, -1],
        [71, -1, 52]],

       [[99, 99, 99],
        [99, 99, 99],
        [99, 99, 99]]])

In [5]: x[np.linalg.matrix_rank(x) != 1]
Out[5]:
array([[[-2, -2, 71],
        [-1, -1, -1],
        [71, -1, 52]]])
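
Rank is only a proxy for "all elements equal": a block such as [[1, 2, 3], [2, 4, 6], [3, 6, 9]] has rank 1 without being constant, and an all-zero block has rank 0. A minimal sketch of a direct test, which compares every element of a block with that block's first element, might look like this. The helper name drop_constant_blocks and the sample array are assumptions made for illustration, not part of the original answer.

```python
import numpy as np

def drop_constant_blocks(x):
    """Return the blocks along axis 0 that are not filled with one repeated value.

    Hypothetical helper for illustration; assumes x has at least one block.
    """
    x = np.asarray(x)
    # Flatten each block so the check works for any block shape.
    flat = x.reshape(x.shape[0], -1)
    # A block is constant when every element equals its first element.
    is_constant = (flat == flat[:, :1]).all(axis=1)
    return x[~is_constant]

x = np.array([
    [[5, 5, 5], [5, 5, 5], [5, 5, 5]],            # constant, dropped
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],            # constant zeros, dropped
    [[1, 2, 3], [2, 4, 6], [3, 6, 9]],            # rank 1 but not constant, kept
    [[-2, -2, 71], [-1, -1, -1], [71, -1, 52]],   # kept
])
result = drop_constant_blocks(x)
print(result.shape)
```

Because the comparison is exact equality, this sketch suits integer data; floating-point blocks may need np.isclose instead.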

CodePudding user response:

You can do this with a boolean mask (a list of True/False values used as an index) and NumPy's np.all() function:

filterarr = []
for arr in newarr:
    if np.all(arr==-2):
        filterarr.append(False)
    else:
        filterarr.append(True)

newarr = newarr[filterarr]
print(newarr)

CodePudding user response:

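You can use np.where, reducing over both block axes so that each 3 by 3 block gets a single flag:

data[np.where(np.any(data != -2, axis=(1, 2)))]

Reducing over axis=-1 alone would flag individual rows instead of whole blocks. With the given array as data, printing the result gives:

[[[-2 -2 71]
  [-1 -1 -1]
  [71 -1 52]]]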
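
The axes passed to np.any decide what gets filtered. The following is a minimal sketch, assuming data is the 4 by 3 by 3 array from the question, that compares the two reductions: axis=-1 yields one flag per row, while axis=(1, 2) yields one flag per block, which is what whole-block removal needs.

```python
import numpy as np

# Rebuild the array from the question.
data = np.array([
    [-2] * 9,
    [-2] * 9,
    [-2, -2, 71, -1, -1, -1, 71, -1, 52],
    [-2] * 9,
]).reshape(4, 3, 3)

# One flag per row: True where the row contains any value other than -2.
row_mask = np.any(data != -2, axis=-1)
# One flag per block: True where the block contains any value other than -2.
block_mask = np.any(data != -2, axis=(1, 2))

rows = data[np.where(row_mask)]      # selects individual rows, result is 2-D
blocks = data[np.where(block_mask)]  # selects whole blocks, result stays 3-D

print(row_mask.shape, rows.shape)
print(block_mask.shape, blocks.shape)
```

With a per-row mask, a row consisting only of -2 inside an otherwise mixed block would be dropped as well. Indexing with the boolean mask directly (data[block_mask]) gives the same result as going through np.where.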

CodePudding user response:

You can build a boolean mask that is True for every block whose elements along the 2nd and 3rd axes all equal -2, and then negate it:

>>> arr
array([[[-2, -2, -2],
        [-2, -2, -2],
        [-2, -2, -2]],

       [[-2, -2, -2],
        [-2, -2, -2],
        [-2, -2, -2]],

       [[-2, -2, 71],
        [-1, -1, -1],
        [71, -1, 52]],

       [[-2, -2, -2],
        [-2, -2, -2],
        [-2, -2, -2]]])
>>> ~(arr == -2).all(axis=(1,2))
array([False, False,  True, False])

And use boolean indexing on the first axis:

>>> arr[~(arr == -2).all(axis=(1,2)), ...]
array([[[-2, -2, 71],
        [-1, -1, -1],
        [71, -1, 52]]])
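
The same mask can be wrapped in a small function so that the fill value and the number of block dimensions are not hard-coded. This is a sketch under stated assumptions: the name drop_filled_blocks and its fill_value parameter are hypothetical, not from the original answer.

```python
import numpy as np

def drop_filled_blocks(arr, fill_value):
    """Return the blocks along axis 0 that are not entirely equal to fill_value."""
    arr = np.asarray(arr)
    # Reduce over every axis except the first, so each block gets one flag.
    block_axes = tuple(range(1, arr.ndim))
    is_filled = (arr == fill_value).all(axis=block_axes)
    # Boolean indexing returns a new array; arr itself is left unchanged.
    return arr[~is_filled]

arr = np.array([
    [-2] * 9,
    [-2] * 9,
    [-2, -2, 71, -1, -1, -1, 71, -1, 52],
    [-2] * 9,
]).reshape(4, 3, 3)

kept = drop_filled_blocks(arr, -2)
print(kept.shape)
```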