NumPy masking with a multidimensional list of indexes

If I have arrays

data = np.array([[0.75, 0.05, 0.1, 0.2],
                 [0.4, 0.3, 0.2, 0.1]])

labels = np.array([3,1])

How can I overlay the indices over the data such that I get this result?

np.array([0.2, 0.3])

In other words, each position in the labels array corresponds to a row of the data array, and the value at that position is the column index to take from that row.

I have been trying to find an efficient way to do this, so preferably something vectorized and simple to understand, if such a method exists.

CodePudding user response：

You just need to index in 2D.

>>> data[np.arange(len(labels)), labels]
array([0.2, 0.3])
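
To see why this works, here is a minimal sketch of what the 2D index does: the two index arrays are paired element by element, so NumPy selects data[0, 3] and data[1, 1]. The names rows, picked and looped are illustrative only, and the loop is there just for comparison.

```python
import numpy as np

data = np.array([[0.75, 0.05, 0.1, 0.2],
                 [0.4, 0.3, 0.2, 0.1]])
labels = np.array([3, 1])

# One row index per label: [0, 1]
rows = np.arange(len(labels))

# Integer array indexing pairs rows[i] with labels[i],
# i.e. it selects data[0, 3] and data[1, 1].
picked = data[rows, labels]

# Equivalent explicit loop, shown for comparison only
looped = np.array([data[i, j] for i, j in zip(rows, labels)])

print(picked)
```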

CodePudding user response：

Apart from the indexing method mentioned by @wjandrea, it's easy to see that the items you want lie on the diagonal of the indexed matrix data[:,labels], or data.T[labels].

So try this:

data[:,labels].diagonal()

#OR

data.T[labels].diagonal()

array([0.2, 0.3])
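
A quick sketch of what the diagonal approach builds along the way, assuming the same data and labels as in the question. The names square, via_diagonal and direct are illustrative. Note that the intermediate array has one column per label, so for n rows it is n by n, which may matter for large inputs.

```python
import numpy as np

data = np.array([[0.75, 0.05, 0.1, 0.2],
                 [0.4, 0.3, 0.2, 0.1]])
labels = np.array([3, 1])

# Selecting columns by label gives one column per label:
# shape (2, 2) here, (n, n) in general for n rows and n labels.
square = data[:, labels]

# Entry [i, i] is data[i, labels[i]], so the diagonal holds the wanted values.
via_diagonal = square.diagonal()

# Same values via direct 2D indexing, without the square intermediate
direct = data[np.arange(len(labels)), labels]

print(square.shape, via_diagonal)
```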

Another way you can do this is by using np.take_along_axis, which lets you pick specific indexes along an axis. In this case, that axis is 1 (the columns). You will have to reshape your labels so that there is one label per row.

np.take_along_axis(data, labels[:,None], axis=1)

array([[0.2],
       [0.3]])

You can then flatten this to the desired 1-D array with output.ravel(), where output is the result above.
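
Put together, the take_along_axis route might look like the sketch below, again assuming the data and labels from the question. The names column and flat are illustrative and not part of the original answer.

```python
import numpy as np

data = np.array([[0.75, 0.05, 0.1, 0.2],
                 [0.4, 0.3, 0.2, 0.1]])
labels = np.array([3, 1])

# labels[:, None] has shape (2, 1): one column index per row of data
column = np.take_along_axis(data, labels[:, None], axis=1)

# ravel() flattens the (2, 1) result to shape (2,)
flat = column.ravel()

print(flat)
```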