Keep only sub-arrays with one unique value at position 0

Time:11-29

Starting from a 3-D NumPy ndarray:

>>> arr
[
    [
        [10, 4, 5, 6, 7],
        [11, 1, 2, 3, 4],
        [11, 5, 6, 7, 8]
    ],
    [
        [12, 4, 5, 6, 7],
        [12, 1, 2, 3, 4],
        [12, 5, 6, 7, 8]
    ],
    [
        [15, 4, 5, 6, 7],
        [15, 1, 2, 3, 4],
        [15, 5, 6, 7, 8]
    ],
    [
        [13, 4, 5, 6, 7],
        [13, 1, 2, 3, 4],
        [14, 5, 6, 7, 8]
    ],
    [
        [10, 4, 5, 6, 7],
        [11, 1, 2, 3, 4],
        [12, 5, 6, 7, 8]
    ]
]

I would like to keep only the sequences of 3 sub-arrays which have only one unique value at position 0, so as to obtain the following:

>>> new_arr
[
    [
        [12, 4, 5, 6, 7],
        [12, 1, 2, 3, 4],
        [12, 5, 6, 7, 8]
    ],
    [
        [15, 4, 5, 6, 7],
        [15, 1, 2, 3, 4],
        [15, 5, 6, 7, 8]
    ]
]

From the initial array, arr[0], arr[3] and arr[4] were discarded because each of them had more than one unique value at position 0 (respectively, [10, 11], [13, 14] and [10, 11, 12]).

I tried fiddling with numpy.unique() but could only get the global unique values at position 0 across all sub-arrays, which is not what's needed here.

-- EDIT

The following seems to get me closer to the solution:

>>> np.unique(arr[0, :, 0])
array([10, 11])

But I'm not sure how to go one level higher than this and apply that condition to each sub-array of arr without using a Python loop.

CodePudding user response:

I got this to work without any transposing.

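import numpy as np
arr = np.array(arr)
# arr[:, :1, 0] keeps the middle axis (shape (n, 1)), so it broadcasts against arr[:, :, 0]
arr[np.all(arr[:, :, 0] == arr[:, :1, 0], axis=1)]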
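
To make the shapes in that one-liner explicit, here is a minimal sketch using the array from the question. The intermediate names (firsts, reference, mask) are only illustrative and do not appear in the original answer.

```python
import numpy as np

# The example array from the question: 5 blocks of 3 rows, 5 values per row.
arr = np.array([
    [[10, 4, 5, 6, 7], [11, 1, 2, 3, 4], [11, 5, 6, 7, 8]],
    [[12, 4, 5, 6, 7], [12, 1, 2, 3, 4], [12, 5, 6, 7, 8]],
    [[15, 4, 5, 6, 7], [15, 1, 2, 3, 4], [15, 5, 6, 7, 8]],
    [[13, 4, 5, 6, 7], [13, 1, 2, 3, 4], [14, 5, 6, 7, 8]],
    [[10, 4, 5, 6, 7], [11, 1, 2, 3, 4], [12, 5, 6, 7, 8]],
])

# Position-0 value of every row in every block: shape (5, 3).
firsts = arr[:, :, 0]

# Position-0 value of the first row of each block. Slicing with :1 instead
# of 0 keeps the middle axis, so the shape is (5, 1) rather than (5,).
reference = arr[:, :1, 0]

# (5, 3) == (5, 1) broadcasts row-wise; a block is kept only if all three
# comparisons in its row are True.
mask = np.all(firsts == reference, axis=1)

new_arr = arr[mask]
print(mask)
print(new_arr)
```

Because the reference column already has shape (5, 1), the comparison broadcasts without any transpose.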

CodePudding user response:

Inspired by an attempt to reply in the form of an edit to the question (which I rejected as it should have been an answer), here is something that worked:

>>> arr[(arr[:,:,0].T == arr[:,0,0]).T.all(axis=1)]
[
    [
        [12, 4, 5, 6, 7],
        [12, 1, 2, 3, 4],
        [12, 5, 6, 7, 8]
    ],
    [
        [15, 4, 5, 6, 7],
        [15, 1, 2, 3, 4],
        [15, 5, 6, 7, 8]
    ]
]

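The trick was to transpose so that the comparison broadcasts: with the example above, arr[:,:,0] has shape (5, 3) and arr[:,0,0] has shape (5,), and the two only line up once the former is transposed to (3, 5). Step by step: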

# all 0-th positions of each subarray
arr[:,:,0].T

# the 0-th position of the first row of each subarray
arr[:,0,0]

# whether each 0-th position equals the first one
(arr[:,:,0].T == arr[:,0,0]).T

# keep only the sub-array where the above is true for all positions
(arr[:,:,0].T == arr[:,0,0]).T.all(axis=1)

# lastly, apply this indexing to the initial array
arr[(arr[:,:,0].T == arr[:,0,0]).T.all(axis=1)]
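
As a quick check of the steps above, the sketch below builds the mask with the double transpose and compares it with two variants that should be equivalent: one adds an axis with np.newaxis instead of transposing, the other tests whether the minimum and maximum at position 0 coincide. The variable names are illustrative, and the min/max variant is an alternative not taken from the answer.

```python
import numpy as np

# The example array from the question: 5 blocks of 3 rows, 5 values per row.
arr = np.array([
    [[10, 4, 5, 6, 7], [11, 1, 2, 3, 4], [11, 5, 6, 7, 8]],
    [[12, 4, 5, 6, 7], [12, 1, 2, 3, 4], [12, 5, 6, 7, 8]],
    [[15, 4, 5, 6, 7], [15, 1, 2, 3, 4], [15, 5, 6, 7, 8]],
    [[13, 4, 5, 6, 7], [13, 1, 2, 3, 4], [14, 5, 6, 7, 8]],
    [[10, 4, 5, 6, 7], [11, 1, 2, 3, 4], [12, 5, 6, 7, 8]],
])

firsts = arr[:, :, 0]      # shape (5, 3): position 0 of every row
reference = arr[:, 0, 0]   # shape (5,): position 0 of each block's first row

# Double-transpose form: (3, 5) == (5,) broadcasts along the last axis,
# then the result is transposed back to (5, 3).
mask_transposed = (firsts.T == reference).T.all(axis=1)

# Same comparison without transposing: give reference a trailing axis so
# that (5, 3) == (5, 1) broadcasts across the 3 rows of each block.
mask_newaxis = (firsts == reference[:, np.newaxis]).all(axis=1)

# Alternative check (not from the answer): a block has a single unique
# value at position 0 exactly when its min and max there are equal.
mask_minmax = firsts.min(axis=1) == firsts.max(axis=1)

new_arr = arr[mask_transposed]
print(mask_transposed)
print(new_arr)
```

All three masks select the same blocks for this input, so the choice between them is mostly a matter of readability.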