Numpy indexing oddity: How to subselect from multidimensional array and keep all axes

Time:01-11

I have a multidimensional array and two lists of integers, L_i and L_j, giving the indices along axis i and axis j that I want to keep. I also want to satisfy the following:

  1. Keep the original dimensionality of the array, even if L_i or L_j consists of just 1 element (in other words, I don't want a singleton axis to be collapsed)
  2. Preserve the order of the axes

What is the cleanest way to do this?

Here is a reproducible example that shows some of the unexpected behavior I've been getting:

import numpy as np
aa = np.arange(120).reshape(5,4,3,2)
aa.shape
### (5,4,3,2) as expected

aa[:,:,:,[0,1]].shape
### (5, 4, 3, 2) as expected

aa[:,:,:,[0]].shape
### (5,4,3,1) as desired. Notice that even though the [0] is one element, 
### that last axis is preserved, which is what I want

aa[:,[1,3],:,[0]].shape
### (2, 5, 3) NOT WHAT I EXPECTED!!
### I was expecting (5, 2, 3, 1)

I'm curious why numpy is collapsing and reordering the axes, and what the best way is to do my subsetting correctly.

CodePudding user response:

Reduce the axes one at a time:

aa[:, [1, 3], :, :][..., [0]].shape
(5, 2, 3, 1)
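
To spell out what the one-liner above does, here is a minimal sketch that splits it into its two steps. The names step1, step2 and taken are only illustrative, and the np.take variant is an alternative spelling added here for comparison, not part of the original answer.

```python
import numpy as np

aa = np.arange(120).reshape(5, 4, 3, 2)

# Index one axis per call, so the two lists never meet in a single
# indexing operation and therefore are never broadcast together.
step1 = aa[:, [1, 3], :, :]   # axis 1 reduced to 2 entries
step2 = step1[..., [0]]       # axis 3 reduced to 1 entry, axis still kept
print(step1.shape)  # (5, 2, 3, 2)
print(step2.shape)  # (5, 2, 3, 1)

# np.take with an explicit axis selects along one axis at a time as well.
taken = np.take(np.take(aa, [1, 3], axis=1), [0], axis=3)
print(np.array_equal(step2, taken))  # True
```

Each step uses a single list index, which is the case that already behaved as desired in the question.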

CodePudding user response:

Regarding the answers to your questions...

  • Why is numpy collapsing the axes?

Because the advanced indices [1,3] and [0] are broadcast together to form a shape (2,) subspace, which replaces the subspace they index (i.e. the axes of size 4 and 2, respectively).
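
As a rough illustration of that broadcasting, the sketch below checks the broadcast shape and confirms that the result holds the index pairs (1, 0) and (3, 0). The variable names i, j and mixed are assumptions made for this example.

```python
import numpy as np

aa = np.arange(120).reshape(5, 4, 3, 2)
i = np.array([1, 3])  # shape (2,), indexes the axis of size 4
j = np.array([0])     # shape (1,), indexes the axis of size 2

# The two index arrays broadcast against each other like any other arrays.
print(np.broadcast(i, j).shape)  # (2,)

# So numpy looks up the pairs (i[0], j[0]) = (1, 0) and (i[1], j[0]) = (3, 0),
# and the two indexed axes are replaced by one axis of length 2.
mixed = aa[:, i, :, j]
print(np.array_equal(mixed[0], aa[:, 1, :, 0]))  # True
print(np.array_equal(mixed[1], aa[:, 3, :, 0]))  # True
```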

  • Why is numpy reordering the axes?

Because the advanced indices are separated by a slice, there is no unambiguous place to drop the new shape (2,) subspace. As a result, numpy places it at the front of the array, with the sliced dimensions trailing afterward (shape (5, 3)).

... and thus you are left with a shape (2, 5, 3) array.
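
The placement rule is easiest to see by comparing the separated case with one where the advanced indices are adjacent. This is only a small sketch; the index lists in the adjacent case are picked arbitrarily for illustration.

```python
import numpy as np

aa = np.arange(120).reshape(5, 4, 3, 2)

# Advanced indices separated by a slice: the broadcast axis (length 2)
# goes to the front, followed by the sliced axes (5 and 3).
separated = aa[:, [1, 3], :, [0]]
print(separated.shape)  # (2, 5, 3)

# Advanced indices next to each other: the broadcast axis stays in the
# position of the axes it replaces, after the sliced axes (5 and 4).
adjacent = aa[:, :, [0, 2], [0, 1]]
print(adjacent.shape)  # (5, 4, 2)
```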

For more info, see the section in the numpy guide on combining basic and advanced indexing.


PS: It is still possible to get your desired shape using just a single indexing call, but you'll have to part ways with the slices and instead define indices that broadcast to shape (5, 2, 3, 1), for instance using np.ix_:

>>> aa[ np.ix_([0, 1, 2, 3, 4], [1, 3], [0, 1, 2], [0]) ].shape
(5, 2, 3, 1)
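
If typing out the full index list for every untouched axis gets tedious, the np.ix_ call can be wrapped in a small helper. The function subselect and its selections argument below are hypothetical names made up for this sketch; they are not part of numpy.

```python
import numpy as np

def subselect(arr, selections):
    """Hypothetical helper: keep only the given indices on the given axes.

    selections maps an axis number to the list of indices to keep on it.
    Axes that are not mentioned are kept in full.
    """
    per_axis = [
        selections.get(axis, np.arange(size))
        for axis, size in enumerate(arr.shape)
    ]
    # np.ix_ reshapes each 1-D index so that they broadcast to an open mesh,
    # which keeps every axis in its original position.
    return arr[np.ix_(*per_axis)]

aa = np.arange(120).reshape(5, 4, 3, 2)
out = subselect(aa, {1: [1, 3], 3: [0]})
print(out.shape)  # (5, 2, 3, 1)
```

Like any advanced indexing, this returns a copy rather than a view of the original array.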