I don't understand numpy.take, though it seems to be the function I want. I have an ndarray and I want to use another ndarray to index into the first.
import numpy as np
# Create a matrix
A = np.arange(75).reshape((5,5,3))
# Create the index array
idx = np.array([[1, 0, 0, 1, 1],
                [1, 1, 0, 1, 1],
                [1, 0, 1, 0, 1],
                [1, 1, 0, 0, 0],
                [1, 1, 1, 1, 0]])
Given the above, I want to index A by the values in idx. I thought take does this, but it doesn't output what I expected.
# Index the 3rd dimension of the A matrix by the idx array.
Asub = np.take(A, idx)
print(f'Value in A at 1,1,1 is {A[1,1,1]}')
print(f'Desired index from idx {idx[1,1]}')
print(f'Value in Asub at [1,1,1] {Asub[1,1]} <- thought this would be 19')
I was expecting each entry of Asub to be the value in A at the position along the third dimension given by idx, but this is what I get:
Value in A at 1,1,1 is 19
Desired index from idx 1
Value in Asub at [1,1,1] 1 <- thought this would be 19
CodePudding user response:
One possibility is to create row and column indices that broadcast with the index for the third dimension, i.e. a (5,1) array and a (5,) array that pair with the (5,5) idx:
In [132]: A[np.arange(5)[:,None],np.arange(5), idx]
Out[132]:
array([[ 1, 3, 6, 10, 13],
[16, 19, 21, 25, 28],
[31, 33, 37, 39, 43],
[46, 49, 51, 54, 57],
[61, 64, 67, 70, 72]])
This ends up picking values from A[:,:,0] and A[:,:,1]. It treats the values of idx as integer indices, which must lie in the valid range (0, 1, 2) for an axis of size 3. They aren't boolean selectors. Out[132][1,1] is 19, the same as A[1,1,1]; Out[132][1,2] is 21, the same as A[1,2,0].
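As a rough sketch of why the original np.take call returned 1: with no axis argument, np.take indexes the flattened array, so idx only ever reaches A.ravel()[0] and A.ravel()[1]. The names flat_take, rows, cols and picked below are illustrative only and are not from the question.

```python
import numpy as np

A = np.arange(75).reshape((5, 5, 3))
idx = np.array([[1, 0, 0, 1, 1],
                [1, 1, 0, 1, 1],
                [1, 0, 1, 0, 1],
                [1, 1, 0, 0, 0],
                [1, 1, 1, 1, 0]])

# Without axis=, np.take works on the flattened array, so each result
# is A.ravel()[0] or A.ravel()[1], i.e. the value 0 or 1.
flat_take = np.take(A, idx)
assert np.array_equal(flat_take, A.ravel()[idx])

# Advanced indexing: rows (5,1), cols (5,) and idx (5,5) broadcast to (5,5),
# so picked[i, j] == A[i, j, idx[i, j]].
rows = np.arange(A.shape[0])[:, None]
cols = np.arange(A.shape[1])
picked = A[rows, cols, idx]

print(flat_take[1, 1])  # 1
print(picked[1, 1])     # 19
```

Building rows and cols from A.shape rather than hard-coding 5 keeps the same expression usable for other array sizes.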
take_along_axis gets the same values, but with an added dimension:
In [142]: np.take_along_axis(A, idx[:,:,None], 2).shape
Out[142]: (5, 5, 1)
In [143]: np.take_along_axis(A, idx[:,:,None], 2)[:,:,0]
Out[143]:
array([[ 1, 3, 6, 10, 13],
[16, 19, 21, 25, 28],
[31, 33, 37, 39, 43],
[46, 49, 51, 54, 57],
[61, 64, 67, 70, 72]])
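The add-a-dimension, take, drop-a-dimension steps can be wrapped up as follows. This is a minimal sketch; pick_along_last_axis is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def pick_along_last_axis(arr, index):
    """Hypothetical helper: return arr[i, j, index[i, j]] for every (i, j)."""
    # take_along_axis needs the index array to have as many dimensions as arr.
    expanded = np.expand_dims(index, axis=-1)
    taken = np.take_along_axis(arr, expanded, axis=-1)
    # Drop the length-1 axis that was added for the lookup.
    return np.squeeze(taken, axis=-1)

A = np.arange(75).reshape((5, 5, 3))
idx = np.array([[1, 0, 0, 1, 1],
                [1, 1, 0, 1, 1],
                [1, 0, 1, 0, 1],
                [1, 1, 0, 0, 0],
                [1, 1, 1, 1, 0]])

Asub = pick_along_last_axis(A, idx)
print(Asub.shape)  # (5, 5)
print(Asub[1, 1])  # 19
```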
The iterative equivalent might be easier to understand:
In [145]: np.array([[A[i,j,idx[i,j]] for j in range(5)] for i in range(5)])
Out[145]:
array([[ 1, 3, 6, 10, 13],
[16, 19, 21, 25, 28],
[31, 33, 37, 39, 43],
[46, 49, 51, 54, 57],
[61, 64, 67, 70, 72]])
If you have trouble expressing an action in "vectorized" array terms, go ahead and write an iterative version first. It will avoid a lot of ambiguity and misunderstanding.
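One way to use the loop version, sketched here under the assumption that the arrays are small enough to iterate over, is as a reference result that any vectorized attempt gets checked against. The names expected and candidate are illustrative.

```python
import numpy as np

A = np.arange(75).reshape((5, 5, 3))
idx = np.array([[1, 0, 0, 1, 1],
                [1, 1, 0, 1, 1],
                [1, 0, 1, 0, 1],
                [1, 1, 0, 0, 0],
                [1, 1, 1, 1, 0]])

# Loop version: states exactly which element is wanted at each (i, j).
expected = np.empty(idx.shape, dtype=A.dtype)
for i, j in np.ndindex(*idx.shape):
    expected[i, j] = A[i, j, idx[i, j]]

# Vectorized candidate, verified against the loop result.
candidate = np.take_along_axis(A, idx[:, :, None], axis=2)[:, :, 0]
assert np.array_equal(candidate, expected)
```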
Another way to get the same values, treating the idx values as True/False booleans, is:
In [146]: np.where(idx, A[:,:,1], A[:,:,0])
Out[146]:
array([[ 1, 3, 6, 10, 13],
[16, 19, 21, 25, 28],
[31, 33, 37, 39, 43],
[46, 49, 51, 54, 57],
[61, 64, 67, 70, 72]])
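The np.where form only matches integer indexing while idx holds nothing but 0 and 1. The sketch below uses a made-up index array, idx3, that also contains 2, to show where the two diverge; np.choose is shown as one possible generalization and is not something the question asked for.

```python
import numpy as np

A = np.arange(75).reshape((5, 5, 3))
# Hypothetical index array using all three valid values 0, 1, 2.
idx3 = np.arange(25).reshape(5, 5) % 3

# Integer indexing along the last axis.
full = np.take_along_axis(A, idx3[:, :, None], axis=2)[:, :, 0]

# np.where treats any nonzero value as True, so 2 is handled like 1.
where_result = np.where(idx3, A[:, :, 1], A[:, :, 0])
assert np.array_equal(where_result != full, idx3 == 2)

# np.choose picks from a sequence of arrays; moving the last axis to the
# front makes A[:, :, 0], A[:, :, 1], A[:, :, 2] the three choices.
chosen = np.choose(idx3, np.moveaxis(A, 2, 0))
assert np.array_equal(chosen, full)
```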
CodePudding user response:
IIUC, you can get the resulting array by adding a trailing axis to idx so that it broadcasts against A, multiplying the two, and then taking index 1 along the last axis, which leaves 0 wherever idx is 0:
Asub = (A * idx[:, :, None])[:, :, 1] # --> Asub[1, 1] = 19
# [[ 1 0 0 10 13]
# [16 19 0 25 28]
# [31 0 37 0 43]
# [46 49 0 0 0]
# [61 64 67 70 0]]
I think it will be the fastest way (or one of the best), particularly for large arrays.
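A quick check, as a sketch only, of how the multiplication result relates to indexing along the third axis: the two agree where idx is 1 and differ where idx is 0. The names masked and selected are illustrative, and no timing is measured here.

```python
import numpy as np

A = np.arange(75).reshape((5, 5, 3))
idx = np.array([[1, 0, 0, 1, 1],
                [1, 1, 0, 1, 1],
                [1, 0, 1, 0, 1],
                [1, 1, 0, 0, 0],
                [1, 1, 1, 1, 0]])

# Multiplication approach: computes A[i, j, 1] * idx[i, j].
masked = (A * idx[:, :, None])[:, :, 1]
# Indexing approach: computes A[i, j, idx[i, j]].
selected = np.take_along_axis(A, idx[:, :, None], axis=2)[:, :, 0]

# Same values where idx is 1 ...
assert np.array_equal(masked[idx == 1], selected[idx == 1])
# ... but where idx is 0, masked holds 0 while selected holds A[i, j, 0].
assert np.all(masked[idx == 0] == 0)
assert np.array_equal(selected[idx == 0], A[:, :, 0][idx == 0])

print(masked[0, 1], selected[0, 1])  # 0 3
```

Which of the two is wanted depends on whether a 0 in idx means "take element 0" or "mask this position out".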