I'm trying to understand numpy fancy indexing. I still cannot tell the difference between passing an np.array(...) and passing a plain old Python list [...] into the (only) square brackets of arr, where arr is an np.array. Here is the concrete example I'm using to learn by doing:
import numpy as np
print("version: %s" % np.__version__) # this prints 1.22.3
x = np.arange(10)
print(x)
print([x[3], x[7], x[2]])
print("---- x[ind_1d], ind_1d=[3,7,2]")
ind_1d = np.array([3, 7, 2])
ind_1d = [3, 7, 2]
print(x[ind_1d])
print("---- x[ind_2d], ind_2d=[[3,7],[4,5]]")
ind_2d = np.array([[3, 7], [4, 5]])
# ind_2d = [[3, 7], [4, 5]]
print(x[ind_2d], end="\n\n")
This program runs without any of the errors or warnings mentioned below. But if I uncomment the line # ind_2d = [[3, 7], [4, 5]], I get a warning:
FutureWarning: Using a non-tuple sequence for multidimensional indexing is
deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be
interpreted as an array index, `arr[np.array(seq)]`, which will result either in an
error or a different result.
and an error:
Traceback (most recent call last):
File ".../index.py", line 14, in <module>
print(x[ind_2d], end="\n\n")
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
update: what I've tried:
- If I set ind_2d = [3,7],[4,5] (so I'm changing the list to a tuple), I still get the error.
- If I set ind_2d = [[3,7],[4,5]], (note the trailing comma, so I'm adding an extra tuple layer), then the program runs without any error or warning.
Can anyone provide some rules to follow so that I can avoid these kinds of errors and/or warnings?
CodePudding user response:
The warning is telling us that indexing with a list should be the same as indexing with an array, but there are some legacy cases where it's treated as indexing with a tuple.
1d array:
In [1]: x = np.arange(10)
For this simple case, indexing with a list and array do the same thing:
In [2]: x[[3, 7, 2]]
Out[2]: array([3, 7, 2])
In [3]: x[np.array([3, 7, 2])]
Out[3]: array([3, 7, 2])
This is your problem case:
In [4]: x[[[3,7],[4,5]]]
<ipython-input-4-c3741544a3c2>:1: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
x[[[3,7],[4,5]]]
Traceback (most recent call last):
Input In [4] in <module>
x[[[3,7],[4,5]]]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
But using an equivalent array works, producing a (2,2) array from the 1d:
In [5]: x[np.array([[3,7],[4,5]])]
Out[5]:
array([[3, 7],
[4, 5]])
In [5], it is clear that the indexing array applies to the one-and-only dimension of x.
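As a small sketch of the same point (the variable names idx and result are mine, not from the question): when an integer array indexes a 1d array, the result takes the shape of the index array.

```python
import numpy as np

x = np.arange(10)
idx = np.array([[3, 7], [4, 5]])

# Every entry of idx is looked up along the single axis of x,
# so the result has the same shape as idx.
result = x[idx]
print(result.shape)
print(result)
```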
[4] is a problem because in the past, certain lists were interpreted as though they were tuples. This is a legacy case that the developers are trying to clean up, hence the FutureWarning.
The 'too many indices' error without the FutureWarning:
In [6]: x[([3,7],[4,5])]
Traceback (most recent call last):
Input In [6] in <module>
x[([3,7],[4,5])]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
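To make the two interpretations explicit without relying on the legacy list handling, here is a minimal sketch (the names seq, as_array and tuple_failed are illustrative). It should behave the same across NumPy versions, because it never passes a bare nested list as the index.

```python
import numpy as np

x = np.arange(10)
seq = [[3, 7], [4, 5]]

# Array index: every entry indexes the single axis of x.
as_array = x[np.array(seq)]
print(as_array)

# Tuple index: each element of the tuple addresses a separate axis,
# so a two-element tuple on a 1d array raises IndexError.
try:
    x[tuple(seq)]
    tuple_failed = False
except IndexError as exc:
    tuple_failed = True
    print("IndexError:", exc)
```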
Let's try this with a 2d array:
In [8]: y = x.reshape(2, 5)
In [9]: y
Out[9]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
With a nested list, we get the FutureWarning, but it works the same as though we gave it a tuple of lists:
In [10]: y[[[1,0],[2,3]]]
<ipython-input-10-4285e452e4fe>:1: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
y[[[1,0],[2,3]]]
Out[10]: array([7, 3])
In [11]: y[([1,0],[2,3])]
Out[11]: array([7, 3])
[11] is the same as y[tuple([[1,0],[2,3]])], as claimed by the warning. Wrapping the list in np.array instead tries to index just the first dimension, resulting in an error because the indices 2 and 3 are too large for an axis of size 2.
In [12]: y[np.array([[1,0],[2,3]])]
Traceback (most recent call last):
Input In [12] in <module>
y[np.array([[1,0],[2,3]])]
IndexError: index 2 is out of bounds for axis 0 with size 2
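A short sketch of what each form means on the 2d array (rows, cols, paired and whole_rows are names I made up for illustration). The tuple form pairs the lists element-wise, while the array form selects whole rows, shown here with indices that are valid for axis 0.

```python
import numpy as np

y = np.arange(10).reshape(2, 5)
rows, cols = [1, 0], [2, 3]

# Tuple of two lists: the lists are paired element-wise,
# picking y[1, 2] and y[0, 3].
paired = y[tuple([rows, cols])]
same = y[rows, cols]
print(paired)

# Array index with valid row numbers: each entry selects a whole row,
# so the result has shape index.shape + (5,).
whole_rows = y[np.array([[1, 0], [0, 1]])]
print(whole_rows.shape)
```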
===
Indexing an n-d array with a tuple is the same as indexing without the ():
In [21]: y[(1, 1)]
Out[21]: 6
In [22]: y[1, 1]
Out[22]: 6
Technically, it's the comma that makes the tuple, not the () (though () and (1,) are also tuples).
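This can be checked in plain Python; the sketch below uses illustrative names (idx, not_a_tuple) and the same y as above.

```python
import numpy as np

y = np.arange(10).reshape(2, 5)

idx = 1, 1            # no parentheses, still a tuple
not_a_tuple = (1)     # parentheses alone only group: this is the int 1

print(type(idx).__name__)
print(type(not_a_tuple).__name__)

# Indexing with the tuple object is the same as writing y[1, 1].
print(y[idx] == y[1, 1])
```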
Indexing with a list is the same as indexing with an array (usually, except for the legacy cases they are trying to clean up):
In [23]: y[[1, 1]]
Out[23]:
array([[5, 6, 7, 8, 9],
[5, 6, 7, 8, 9]])
In [24]: y[np.array([1, 1])]
Out[24]:
array([[5, 6, 7, 8, 9],
[5, 6, 7, 8, 9]])
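As a rule of thumb, stating the intent explicitly avoids both the warning and the ambiguity. The sketch below is one way to follow that rule, not the only one; the variable names are illustrative. As far as I know, NumPy 1.23 and later dropped the legacy interpretation and treat a nested list as an array index, so being explicit also keeps results consistent across versions.

```python
import numpy as np

y = np.arange(10).reshape(2, 5)

# Rule 1: to pick entries along the first axis, pass an array.
rows_twice = y[np.asarray([1, 1])]
print(rows_twice.shape)

# Rule 2: to give one index per axis, pass a tuple (or use commas).
element = y[(1, 1)]
picks = y[([0, 1], [4, 0])]   # y[0, 4] and y[1, 0]
print(element, picks)
```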