I'm trying to replicate the following code from MATLAB in Python:
X = [1,2,3;
4,5,6];
idx = [2,1];
idy = [3,2];
x = X(idx,idy)
Output:
x =
6 5
3 2
I wish it was as simple as:
X = np.array([[1,2,3],[4,5,6]])
idx = np.array([1,0])
idy = np.array([2,1])
x = X[idx,idy]
but that is not the case.
What am I missing?
Second Example:
X = [0.677888793318259 0.141577665623206 0.175278629658347 0.00525491140018830
0.684907327765215 0.137981388019339 0.172732191381956 0.00437909283349025
0.691925862212172 0.134385110415471 0.170185753105565 0.00350327426679220
0.698944396659129 0.130788832811603 0.167639314829174 0.00262745570009415
0.705962931106086 0.127192555207735 0.165092876552782 0.00175163713339610
0.712981465553043 0.123596277603868 0.162546438276391 0.000875818566698050
0.720000000000000 0.120000000000000 0.160000000000000 0]
idx = [1 1 1 1 1 1 0]
idy = [1 1 1 1]
Correct Output:
x = [0.677888793318259 0.141577665623206 0.175278629658347 0.00525491140018830
0.684907327765215 0.137981388019339 0.172732191381956 0.00437909283349025
0.691925862212172 0.134385110415471 0.170185753105565 0.00350327426679220
0.698944396659129 0.130788832811603 0.167639314829174 0.00262745570009415
0.705962931106086 0.127192555207735 0.165092876552782 0.00175163713339610
0.712981465553043 0.123596277603868 0.162546438276391 0.000875818566698050]
CodePudding user response:
MATLAB automatically indexes the cross product.
The numpy equivalent is np.ix_
:
Using
ix_
one can quickly construct index arrays that will index the cross product.a[np.ix_([1,3],[2,5])]
returns the array[[a[1,2] a[1,5]], [a[3,2] a[3,5]]]
.
>>> X[np.ix_(idx, idy)]
# array([[6, 5],
# [3, 2]])
Or as Michael commented, you can index each dimension explicitly:
>>> X[idx][:, idy]
# array([[6, 5],
# [3, 2]])
CodePudding user response:
Integer Indexing
In numpy, indexing with a list of integers is called advanced, or fancy indexing. the shape of the output is the shape of the broadcasted input indices. This is more versatile than the Matlab notation, which forces broadcasting whether you ask for it or not. The matlab equivalent of non-broadcasted indexing is using linear indices, which is much more awkward. To get two index arrays to make a grid, they just need to have their non-unit dimension along the appropriate axis:
X = np.array([
[1, 2, 3],
[4, 5, 6]])
idx = np.array([1, 0])
idy = np.array([2, 1])
X[idx[:, None], idy]
The function np.ix_
wraps the reshape to an arbitrary number of axes:
X[np.ix_[idx, idy]]
Boolean Indexing
Your second example uses logical masks, or what in numpy is called boolean indexing, or masking. To tell numpy that you want to use a boolean mask, you have to pass in a boolean array. Using a list of python bool
s is one way to do it. Another is to explicitly set dtype=bool
in the index:
idx = np.array([True, True])
idy = np.array([1, 0, 1], dtype=bool)
X[np.ix_(idx, idy)]
Notice that for this application, np.ix_
converts the indices into integers:
>>> np.ix_(idx, idy)
(array([[0],
[1]]),
array([[0, 1, 2]]))
You can write the equivalent expression using np.flatnonzero
and the reshape technique used for the first part:
X[np.flatnonzero(idx)[:, None], np.flatnonzero(idy)]
Numpy boolean masks can contain any combination of True
and False
values, as long as the mask has the same shape as the leading dimensions of the array that it is applied to. That means that you can construct the output mask as idx[:, None] & idy
to get the shape you need.
>>> X[idx[:, None] & idy]
array([1, 3, 4, 6])
Because masks can contain any number of elements, the result of indexing this way will be a flat array, unlike the result of integer indexing. This part of the reason that ix_
converts to integers first. You can get around this by counting the number of non-zero elements in each of the axis masks, e.g. with np.count_nonzero
, or np.sum
:
>>> X[idx[:, None] & idy].reshape(np.count_nonzero(idx), np.sum(idy))
array([[1, 3],
[4, 6]])