Indexing a `numpy.array` with a `dask.array`

Time:02-27

I am getting errors when indexing a numpy.array with a dask.array, and I'm not sure whether this is a feature or a bug.

In [1]: import numpy as np
In [2]: import dask.array as dka
In [3]: foo = np.arange(10)
In [4]: bar = np.arange(3)

In [5]: foo[bar]
Out[5]: array([0, 1, 2])

In [6]: foo[dka.from_array(bar)]
<ipython-input-16-9c4b06c0d0c4>:1: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  foo[dka.from_array(bar)]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-16-9c4b06c0d0c4> in <module>
----> 1 foo[dka.from_array(bar)]

IndexError: too many indices for array: array is 1-dimensional, but 3 were indexed

I am using versions dask==2022.01.0 and numpy==1.22.0.

Answer:

The problem, as the error says (though not very clearly), is that numpy doesn't recognize a dask array as a supported index type but does recognize it as a sequence. It therefore interprets your indexing operation as an attempt at multidimensional indexing, with one index per element of bar: three elements means three indices, which is too many for a 1-dimensional array.
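The same interpretation can be reproduced with plain numpy and no dask at all. This is a minimal sketch, assuming the dask array ends up being treated like a tuple of its elements; the names by_array and error_message are only illustrative.

```python
import numpy as np

foo = np.arange(10)
bar = np.arange(3)

# An integer array index selects elements along the first axis.
by_array = foo[bar]

# A tuple is read as one index per dimension, so three entries
# ask for three axes of a 1-dimensional array.
try:
    foo[tuple(bar)]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(by_array)
print(error_message)
```

The second lookup should fail with the same "too many indices" IndexError shown in the question.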

If you make it clear that you're trying to provide an array of indices into the first dimension, this works just fine:

In [5]: foo[dka.from_array(bar), ...]
Out[5]: array([0, 1, 2])

Equivalently, you could use np.take, among other indexing options:

In [6]: np.take(foo, dka.from_array(bar), axis=0)
Out[6]: array([0, 1, 2])
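Another option, not taken from the answer above, is to turn the dask array into a numpy array before indexing. The sketch below assumes the index is small enough to hold in memory; lazy_index and concrete_index are illustrative names.

```python
import numpy as np
import dask.array as dka

foo = np.arange(10)
bar = np.arange(3)
lazy_index = dka.from_array(bar)

# compute() evaluates the dask graph and returns a numpy.ndarray,
# which numpy accepts directly as an integer array index.
concrete_index = lazy_index.compute()
result = foo[concrete_index]
print(result)
```

This gives up laziness for the index, so it is probably only a good fit when the indexed array is itself an in-memory numpy array, as it is here.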