How to broadcast from 3-dimensional matrix using indices from 2-D matrix?-CodePudding

I have a specific matrix with dimensions (nz,ny,nx) and another matrix with dimensions (ny,nx). In this other matrix are specific values and for instance I want to sum all the points in this first 3-dimensional matrix at locations where the second matrix has a specific value.

I am doing the following:

    kkw = np.squeeze(np.where(wz==wval))

which has (2,X) elements and when I now try to do:

    serout = np.sum(dum[:,kkw],axis=(1,2))

I end with something that is completely wrong or even uses a lot of memory and gets killed in the end.

What is the correct way to broadcast/sum the values in the original matrix using another matrix for mask and a specific mask value to make this sum?

Here is the small example of the problem:

nz,ny,nx = 3,20,10
datain = np.random.random((nz,ny,nx))
mask = np.array(5 5.*np.random.random((ny,nx)),dtype='int');

kkw = np.squeeze(np.where(mask==5))

dataout = np.sum(datain[:,kkw],axis=(1,2))

So, in the end I would expect dataout to have 3 elements, but it has dimensions (3,20). Clearly, I am not doing the broadcasting correctly.

CodePudding user response：

Your sample arrays:

In [165]: nz,ny,nx = 3,20,10
     ...: datain = np.random.random((nz,ny,nx))
     ...: mask = np.array(5 5.*np.random.random((ny,nx)),dtype='int');
     ...: 
In [166]: datain.shape
Out[166]: (3, 20, 10)
In [167]: mask.shape
Out[167]: (20, 10)

Indices where mask is 5:

In [170]: np.where(mask==5)
Out[170]: 
(array([ 0,  0,  0,  0,  1,  1,  1,  2,  3,  3,  3,  4,  4,  4,  4,  5,  6,
         6,  6,  7,  7,  8,  8,  9,  9,  9,  9, 10, 10, 10, 11, 12, 13, 13,
        13, 13, 13, 15, 15, 15, 16, 16, 17, 17, 17, 18, 18, 18, 18, 18, 19,
        19]),
 array([1, 3, 6, 9, 3, 6, 8, 9, 1, 5, 7, 0, 2, 3, 6, 9, 1, 6, 9, 8, 9, 3,
        5, 1, 3, 4, 8, 0, 4, 9, 0, 2, 0, 3, 5, 6, 8, 2, 5, 9, 0, 1, 2, 5,
        8, 0, 4, 6, 8, 9, 3, 7]))

I haven't seen squeeze applied to this, but all it's doing is converting the tuple of arrays to a (2,52) array. np.argwhere applies np.transpose, producing a (52,2) array.

In [171]: np.squeeze(_)
Out[171]: 
array([[ 0,  0,  0,  0,  1,  1,  1,  2,  3,  3,  3,  4,  4,  4,  4,  5,
         6,  6,  6,  7,  7,  8,  8,  9,  9,  9,  9, 10, 10, 10, 11, 12,
        13, 13, 13, 13, 13, 15, 15, 15, 16, 16, 17, 17, 17, 18, 18, 18,
        18, 18, 19, 19],
       [ 1,  3,  6,  9,  3,  6,  8,  9,  1,  5,  7,  0,  2,  3,  6,  9,
         1,  6,  9,  8,  9,  3,  5,  1,  3,  4,  8,  0,  4,  9,  0,  2,
         0,  3,  5,  6,  8,  2,  5,  9,  0,  1,  2,  5,  8,  0,  4,  6,
         8,  9,  3,  7]])

The where tuple is easier to use when indexing.

In [172]: I,J = np.where(mask==5)
In [173]: I.shape
Out[173]: (52,)

Applied to mask itself we get all the 5 values:

In [174]: mask[I,J]
Out[174]: 
array([5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
       5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
       5, 5, 5, 5, 5, 5, 5, 5])

Applied to the 3d array:

In [175]: datain[:,I,J].shape
Out[175]: (3, 52)
In [176]: datain[:,I,J].sum(axis=1)
Out[176]: array([28.01960227, 26.66364387, 26.24709875])

datain[:,kkw] produces a (3,2,52,10) array. It's applying the (2,52) array to index just the size 20 dimension. Converting the where tuple to a 2d array does not help with this indexing. When indexing, the distinction between tuples, lists, and arrays is very important.