Vectorised slicing of multiple rows in ndarray at arbitrary positions

I often find myself holding an array not of indices, but of index bounds that effectively define multiple slices. A representative example is:

import numpy as np

rand = np.random.default_rng(seed=0)
sample = rand.integers(low=0, high=10, size=(10, 10))
y, x = np.mgrid[:10, :10]
bad_starts = rand.integers(low=0, high=10, size=(10, 1))
print(bad_starts)

sample[
    (x >= bad_starts) & (y < 5)
] = -1

print(sample)

Output:

[[4]
 [7]
 [3]
 [2]
 [7]
 [8]
 [0]
 [0]
 [6]
 [3]]
[[ 8  6  5  2 -1 -1 -1 -1 -1 -1]
 [ 6  9  5  6  9  7  6 -1 -1 -1]
 [ 2  8  6 -1 -1 -1 -1 -1 -1 -1]
 [ 8  1 -1 -1 -1 -1 -1 -1 -1 -1]
 [ 4  0  0  1  0  6  5 -1 -1 -1]
 [ 7  3  4  9  8  9  3  6  9  6]
 [ 8  6  7  3  8  1  5  7  8  5]
 [ 3  3  4  4  7  8  0  9  5  3]
 [ 6  5  2  3  7  5  5  3  7  3]
 [ 3  8  2  2  7  6  0  0  3  8]]

Is there a simpler way to accomplish the same thing with slices alone, avoiding having to call mgrid and avoiding an entire boolean predicate matrix?

CodePudding user response:

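With np.ogrid in place of np.mgrid (y, x = np.ogrid[:10, :10]) you get a 'sparse' grid, a (10, 1) column and a (1, 10) row instead of two full (10, 10) arrays: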

In [488]: y,x
Out[488]: 
(array([[0],
        [1],
        [2],
        [3],
        [4],
        [5],
        [6],
        [7],
        [8],
        [9]]),
 array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]))

The mask expression is unchanged, and broadcasting expands it to the same (10, 10) boolean array: (x >= bad_starts) & (y < 5)
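
As a minimal sketch of that claim, the snippet below rebuilds the question's arrays and checks that the dense and sparse grids give the same mask. The names y_dense, x_dense, mask_dense and mask_sparse are introduced here for illustration only.

```python
import numpy as np

rand = np.random.default_rng(seed=0)
sample = rand.integers(low=0, high=10, size=(10, 10))
bad_starts = rand.integers(low=0, high=10, size=(10, 1))

# Dense grids: two full (10, 10) index arrays.
y_dense, x_dense = np.mgrid[:10, :10]
# Sparse grids: a (10, 1) column and a (1, 10) row that broadcast on demand.
y, x = np.ogrid[:10, :10]

mask_dense = (x_dense >= bad_starts) & (y_dense < 5)
mask_sparse = (x >= bad_starts) & (y < 5)

print(y.shape, x.shape)                          # (10, 1) (1, 10)
print(np.array_equal(mask_dense, mask_sparse))   # True
```

The boolean mask itself is still a full (10, 10) array; only the index grids get smaller.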

A single value for each row can be fetched (or set) with:

In [491]: sample[np.arange(5)[:,None],bad_starts[:5]]
Out[491]: 
array([[-1],
       [-1],
       [-1],
       [-1],
       [-1]])
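
The same pairing of row numbers with start columns also works with 1-D index arrays. The sketch below starts from a fresh, unmodified sample and writes one element per row; the names rows, cols and before are illustrative and not part of the original session.

```python
import numpy as np

rand = np.random.default_rng(seed=0)
sample = rand.integers(low=0, high=10, size=(10, 10))
bad_starts = rand.integers(low=0, high=10, size=(10, 1))

rows = np.arange(5)          # row numbers 0..4, shape (5,)
cols = bad_starts[:5, 0]     # first bad column of each row, shape (5,)

before = sample[rows, cols].copy()   # fetch: one value per row
sample[rows, cols] = -1              # set: writes exactly one element per row

print(before.shape)                  # (5,)
print(sample[rows, cols])            # [-1 -1 -1 -1 -1]
```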

But there isn't a way of accessing all the -1 values with basic slicing, because each row needs a slice of a different length:

In [492]: [sample[i,bad_starts[i,0]:] for i in range(5)]
Out[492]: 
[array([-1, -1, -1, -1, -1, -1]),
 array([-1, -1, -1]),
 array([-1, -1, -1, -1, -1, -1, -1]),
 array([-1, -1, -1, -1, -1, -1, -1, -1]),
 array([-1, -1, -1])]
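
If a short Python loop is acceptable, the per-row slices can also be used for assignment, since each one is a view into sample. This is a sketch rather than a vectorised solution; expected is a hypothetical name for a reference copy built with the mask approach.

```python
import numpy as np

rand = np.random.default_rng(seed=0)
sample = rand.integers(low=0, high=10, size=(10, 10))
bad_starts = rand.integers(low=0, high=10, size=(10, 1))

# Reference result using the boolean mask from the question.
expected = sample.copy()
y, x = np.ogrid[:10, :10]
expected[(x >= bad_starts) & (y < 5)] = -1

# One basic slice per row; assigning through the slice modifies sample in place.
for i in range(5):
    sample[i, bad_starts[i, 0]:] = -1

print(np.array_equal(sample, expected))   # True
```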

A single slice always selects a rectangular block, so it cannot describe this ragged selection.
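
The row condition y < 5 is rectangular, though, so that part can be a basic slice. The sketch below, under the assumption that a smaller (5, 10) mask is acceptable, slices the rows first and compares a 1-D np.arange against the start column of each row; top, cols and expected are illustrative names.

```python
import numpy as np

rand = np.random.default_rng(seed=0)
sample = rand.integers(low=0, high=10, size=(10, 10))
bad_starts = rand.integers(low=0, high=10, size=(10, 1))

# Reference result using the boolean mask from the question.
expected = sample.copy()
y, x = np.ogrid[:10, :10]
expected[(x >= bad_starts) & (y < 5)] = -1

top = sample[:5]                     # basic slice: a view of rows 0..4
cols = np.arange(sample.shape[1])    # column numbers, shape (10,)
# (10,) compared with (5, 1) broadcasts to a (5, 10) mask.
top[cols >= bad_starts[:5]] = -1     # writes through the view into sample

print(np.array_equal(sample, expected))   # True
```

This avoids both grid arrays, but a boolean mask is still built for the affected rows.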

The equivalent 'advanced indexing' arrays, which list the row and column of every selected element, are:

In [494]: np.nonzero((x >= bad_starts) & (y < 5))
Out[494]: 
(array([0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3,
        3, 3, 4, 4, 4]),
 array([4, 5, 6, 7, 8, 9, 7, 8, 9, 3, 4, 5, 6, 7, 8, 9, 2, 3, 4, 5, 6, 7,
        8, 9, 7, 8, 9]))
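
These index arrays can also be built directly from the start positions, without forming a mask first. The following is one possible sketch using np.repeat and np.cumsum; it is not taken from the answer, and the names n_rows, starts, lengths and run_begin are assumptions made for illustration.

```python
import numpy as np

rand = np.random.default_rng(seed=0)
sample = rand.integers(low=0, high=10, size=(10, 10))
bad_starts = rand.integers(low=0, high=10, size=(10, 1))

n_rows = 5                              # rows affected, as in y < 5
starts = bad_starts[:n_rows, 0]         # first selected column per row
lengths = sample.shape[1] - starts      # number of selected columns per row

# Row index: each row number repeated once per selected element.
rows = np.repeat(np.arange(n_rows), lengths)
# Column index: position inside the row's run, shifted by that row's start.
run_begin = np.cumsum(lengths) - lengths    # offset of each run in the flat list
cols = (np.arange(lengths.sum())
        - np.repeat(run_begin, lengths)
        + np.repeat(starts, lengths))

# Compare against the indices derived from the boolean mask.
y, x = np.ogrid[:10, :10]
ref_rows, ref_cols = np.nonzero((x >= bad_starts) & (y < 5))
print(np.array_equal(rows, ref_rows), np.array_equal(cols, ref_cols))   # True True

sample[rows, cols] = -1                 # same effect as the masked assignment
```

Whether this is any simpler than the mask is debatable; it trades the boolean matrix for two integer arrays with one entry per selected element.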