Home > Software design >  Problem applying an operation by NumPy advanced indexing
Problem applying an operation by NumPy advanced indexing

Time:05-25

I have two arrays by shape c = (2, 10, 2, 3) and p = (2, 5, 3). I want to pick the first vectors in the 3d dimension of c and minus it from each vectors in the corresponding row of p. It will be better understandable by this example, we can do this operation on the first row by c[0, :, None, 0] - p[0] or c[0, :, 0][:, None] - p[0], which will get result with shape (10, 5, 3). But I have confused how to apply this operation on all rows at once by advanced indexing, which result will be as shape (2, 10, 5, 3).

p = np.array([[[0., -0.05,  0.], [0., -0.05,  0.], [0.,  0.,  0.], [-0.05, -0.1,  0.05], [0.1,  0.2,  0.1]],
              [[0., -0.05,  0.], [0., -0.05,  0.], [0.,  0.,  0.], [-0.05, -0.1,  0.05], [0.1,  0.2,  0.1]]])

c = np.array([[[[ 0.05,  0. ,  0.05], [ 0.05, -0.1,  0.05]], [[ 0.05, -0.1,  0.05], [-0.05, -0.1,  0.05]],
               [[-0.05, -0.1,  0.05], [-0.05,  0. ,  0.05]], [[ 0.05,  0. , -0.05], [ 0.05, -0.1, -0.05]],
               [[ 0.05, -0.1, -0.05], [-0.05, -0.1, -0.05]], [[-0.05, -0.1, -0.05], [-0.05,  0. , -0.05]],
               [[ 0.05,  0. ,  0.05], [ 0.05,  0. , -0.05]], [[ 0.05, -0.1,  0.05], [ 0.05, -0.1, -0.05]],
               [[-0.05, -0.1,  0.05], [-0.05, -0.1, -0.05]], [[-0.05,  0. ,  0.05], [-0.05,  0. , -0.05]]],
              [[[ 0.05,  0.1,  0.05], [ 0.05,  0. ,  0.05]], [[ 0.05,  0. ,  0.05], [-0.05,  0. ,  0.05]],
               [[-0.05,  0. ,  0.05], [-0.05,  0.1,  0.05]], [[ 0.05,  0.1, -0.05], [ 0.05,  0. , -0.05]],
               [[ 0.05,  0. , -0.05], [-0.05,  0. , -0.05]], [[-0.05,  0. , -0.05], [-0.05,  0.1, -0.05]],
               [[ 0.05,  0.1,  0.05], [ 0.05,  0.1, -0.05]], [[ 0.05,  0. ,  0.05], [ 0.05,  0. , -0.05]],
               [[-0.05,  0. ,  0.05], [-0.05,  0. , -0.05]], [[-0.05,  0.1,  0.05], [-0.05,  0.1, -0.05]]]])

additional suggestion:
It is not the main issue, but will be helpful for me and I will be grateful for any alternative way to handle this:
I have broadcasted p from shape (5, 3) to (2, 5, 3) to be corresponded with 1st dimension of c. Is there a better way, without this broadcasting (may be just by advanced indexing instead), to handle this problem?

CodePudding user response:

Let's start here:

I have broadcasted p from shape (5, 3) to (2, 5, 3) to be corresponded with 1st dimension of c. Is there a better way, without this broadcasting (may be just by advanced indexing instead), to handle this problem?

So, if I understand correctly, p should actually be (5, 3), you doubled it manually to make a (2, 5, 3) array. As you'll see, you do not have to do that, actually sticking to a (5, 3) array is going to make the whole thing much easier.

Now when you do this:

c[0, :, None, 0]

What you get is a (10, 1, 3) array. You're on the right track here, as the second dimension of 1 is what you'll need for broadcasting, but again, you can do even simpler with

c[:, :, None, 0]

which will give you a (2, 10, 1, 3) array. Which is something you really much need in order to take advantage of numpy's broadcasting rules:

When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when: 1. they are equal, or 2. one of them is 1

So, if we have p.shape == (5, 3) and c transformed so that c.shape == (2, 10, 1, 3), the matching will go right from left:

  • last dimensions will be matched as they are both 3 (rule 1)
  • the second to last of c being 1, it can be matched with the first of p (rule 2)
  • the shorter (p) is then going to be matched along the remaining c dimensions

Schematically:

 (2, 10, 1, 3)
  |  |   |  |
  |  |   V  V
  |  |  (5, 3)
  |  |   |  |
  V  V   V  V
 (2, 10, 5, 3) 

Long explanation but actually the application is going to be very simple. So, let's try it:

import numpy as np

# let's keep this one at (5, 3)
p = np.array([[0., -0.05,  0.], [0., -0.05,  0.], [0.,  0.,  0.], [-0.05, -0.1,  0.05], [0.1,  0.2,  0.1]])

c = np.array([[[[ 0.05,  0. ,  0.05], [ 0.05, -0.1,  0.05]], [[ 0.05, -0.1,  0.05], [-0.05, -0.1,  0.05]],
               [[-0.05, -0.1,  0.05], [-0.05,  0. ,  0.05]], [[ 0.05,  0. , -0.05], [ 0.05, -0.1, -0.05]],
               [[ 0.05, -0.1, -0.05], [-0.05, -0.1, -0.05]], [[-0.05, -0.1, -0.05], [-0.05,  0. , -0.05]],
               [[ 0.05,  0. ,  0.05], [ 0.05,  0. , -0.05]], [[ 0.05, -0.1,  0.05], [ 0.05, -0.1, -0.05]],
               [[-0.05, -0.1,  0.05], [-0.05, -0.1, -0.05]], [[-0.05,  0. ,  0.05], [-0.05,  0. , -0.05]]],
              [[[ 0.05,  0.1,  0.05], [ 0.05,  0. ,  0.05]], [[ 0.05,  0. ,  0.05], [-0.05,  0. ,  0.05]],
               [[-0.05,  0. ,  0.05], [-0.05,  0.1,  0.05]], [[ 0.05,  0.1, -0.05], [ 0.05,  0. , -0.05]],
               [[ 0.05,  0. , -0.05], [-0.05,  0. , -0.05]], [[-0.05,  0. , -0.05], [-0.05,  0.1, -0.05]],
               [[ 0.05,  0.1,  0.05], [ 0.05,  0.1, -0.05]], [[ 0.05,  0. ,  0.05], [ 0.05,  0. , -0.05]],
               [[-0.05,  0. ,  0.05], [-0.05,  0. , -0.05]], [[-0.05,  0.1,  0.05], [-0.05,  0.1, -0.05]]]])

# now, it's really as simple as that
result = c[:, :, None, 0] - p  # shape: (2, 10, 5, 3)
  • Related