Numpy: using a 4D Boolean matrix to apply mathematical calculation to associated 3D matrix

Time:07-12

I have a 4D NumPy array E of Booleans with shape (3, 3, 4, 3), produced as follows:

import numpy as np

threshold = 2

A = np.array([ [ [90,  84,  88], [10, 30, 17], [7,  0,  4] ], 
               [ [88,  83, 102], [12, 14, 15], [12, 17, 7]], 
               [ [94,  14,  85], [8,  23, 20], [25, 5,27]], 
               [ [150, 90, 103], [9,  16, 21], [17, 7, 12] ] ])

A

array([[[ 90,  84,  88],
        [ 10,  30,  17],
        [  7,   0,   4]],

       [[ 88,  83, 102],
        [ 12,  14,  15],
        [ 12,  17,   7]],

       [[ 94,  14,  85],
        [  8,  23,  20],
        [ 25,   5,  27]],

       [[150,  90, 103],
        [  9,  16,  21],
        [ 17,   7,  12]]])


# build pairwise differences between the axis 0 elements of A
B = A.T[..., None, :] - A.T[...,None]

# remove upper triangle and take abs
C = np.tril(B, k=0)
C = np.absolute(C)

# delete unnecessary column
D = np.delete(C, 3, axis=3)
D

array([[[[ 0,  0,  0],
         [ 2,  0,  0],
         [ 4,  6,  0],
         [60, 62, 56]],

        [[ 0,  0,  0],
         [ 2,  0,  0],
         [ 2,  4,  0],
         [ 1,  3,  1]],

        [[ 0,  0,  0],
         [ 5,  0,  0],
         [18, 13,  0],
         [10,  5,  8]]],


       [[[ 0,  0,  0],
         [ 1,  0,  0],
         [70, 69,  0],
         [ 6,  7, 76]],

        [[ 0,  0,  0],
         [16,  0,  0],
         [ 7,  9,  0],
         [14,  2,  7]],

        [[ 0,  0,  0],
         [17,  0,  0],
         [ 5, 12,  0],
         [ 7, 10,  2]]],


       [[[ 0,  0,  0],
         [14,  0,  0],
         [ 3, 17,  0],
         [15,  1, 18]],

        [[ 0,  0,  0],
         [ 2,  0,  0],
         [ 3,  5,  0],
         [ 4,  6,  1]],

        [[ 0,  0,  0],
         [ 3,  0,  0],
         [23, 20,  0],
         [ 8,  5, 15]]]])

E = np.where(D <= threshold, True, False)
E

array([[[[ True,  True,  True],
         [ True,  True,  True],
         [False, False,  True],
         [False, False, False]],

        [[ True,  True,  True],
         [ True,  True,  True],
         [ True, False,  True],
         [ True, False,  True]],

        [[ True,  True,  True],
         [False,  True,  True],
         [False, False,  True],
         [False, False, False]]],


       [[[ True,  True,  True],
         [ True,  True,  True],
         [False, False,  True],
         [False, False, False]],

        [[ True,  True,  True],
         [False,  True,  True],
         [False, False,  True],
         [False,  True, False]],

        [[ True,  True,  True],
         [False,  True,  True],
         [False, False,  True],
         [False, False,  True]]],


       [[[ True,  True,  True],
         [False,  True,  True],
         [False, False,  True],
         [False,  True, False]],

        [[ True,  True,  True],
         [ True,  True,  True],
         [False, False,  True],
         [False, False,  True]],

        [[ True,  True,  True],
         [False,  True,  True],
         [False, False,  True],
         [False, False, False]]]])

The original array A has shape (4, 3, 3). Because E's shape does not match A's, E cannot be used directly as a Boolean mask on A.

I want to average A along axis 0, using only the values whose associated entry in E is True, and to return np.nan where no entry is True for a given position.

The desired output looks as follows:

array([[ 89.  ,  83.5 , 102.5 ],
       [  9.5 ,  15.  ,  18.25],
       [   nan,   6.  ,    nan]])

How do I do this?

Thanks!

Answer:

A has shape (4, 3, 3).

B and C are based on A.T, which has shape (3, 3, 4); broadcasting makes them (3, 3, 4, 4). The first 2 dimensions of C correspond to the last 2 of A (in reverse order, because of the transpose).

D and E drift further from A with that delete. Their last dimension has no relation to the dimensions of A.

In short, the "associated values" of E are not clearly defined. We could transpose E back to make it (3, 4, 3, 3), but we would still have a "surplus" leading dimension of size 3 that has no relation to A.
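The shape bookkeeping above can be checked directly. This is a minimal sketch using a placeholder array with the same shape as A; the values are arbitrary, and only the shapes and the index relationship matter here:

```python
import numpy as np

# Placeholder with the same shape as A in the question; values are arbitrary.
A = np.arange(36).reshape(4, 3, 3)

B = A.T[..., None, :] - A.T[..., None]   # pairwise differences along A's axis 0
C = np.absolute(np.tril(B, k=0))
D = np.delete(C, 3, axis=3)
E = D <= 2

shapes = {"A": A.shape, "A.T": A.T.shape, "B": B.shape,
          "D": D.shape, "E": E.shape, "E.T": E.T.shape}
for name, shape in shapes.items():
    print(name, shape)

# B[i, j, a, b] holds A[b, j, i] - A[a, j, i]
i, j, a, b = 2, 1, 3, 0
index_relation_holds = bool(B[i, j, a, b] == A[b, j, i] - A[a, j, i])
print(index_relation_holds)
```

So a and b both index axis 0 of A, while i and j are A's last two axes in reverse order.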

I suppose you could do some sort of all/any reduction on the transposed E along axis 0, to get a (4, 3, 3) mask:

In [77]: F = E.transpose().all(axis=0) 
In [78]: F.shape
Out[78]: (4, 3, 3)
In [79]: A[F].shape
Out[79]: (13,)
In [80]: A[F]
Out[80]: array([90, 84, 88, 10, 30, 17,  7,  0,  4, 88, 83, 12, 15])

Or use multiplication to make a (3, 4, 3, 3) array that is 0 wherever E.T is False:

In [85]: ae = A*E.T
In [86]: ae.shape
Out[86]: (3, 4, 3, 3)

and take the mean over the first 2 dimensions:

In [88]: np.mean(ae, axis=(0,1))
Out[88]: 
array([[52.33333333, 42.91666667, 54.66666667],
       [ 8.33333333, 13.08333333, 11.41666667],
       [ 5.83333333,  3.83333333,  4.41666667]])
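Note that this plain mean divides by all 12 entries along the first two axes, including those the mask set to 0, which pulls the result toward zero. A small illustration with made-up numbers (not taken from A):

```python
import numpy as np

values = np.array([10.0, 20.0, 30.0])
mask = np.array([True, False, True])

# Masked-out entries become 0 but still count in the denominator.
naive = (values * mask).mean()                # (10 + 0 + 30) / 3

# Dividing by the number of True entries averages only the kept values.
masked = (values * mask).sum() / mask.sum()   # (10 + 30) / 2

print(naive, masked)
```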

Another approach: sum both ae and E.T over the same axes and divide, so that only True entries count toward the average:

In [91]: ae.sum((0,1))/E.T.sum((0,1))
Out[91]: 
array([[89.71428571, 73.57142857, 93.71428571],
       [10.        , 22.42857143, 17.125     ],
       [11.66666667,  6.57142857,  8.83333333]])

In [92]: E.T.sum((0,1))
Out[92]: 
array([[ 7,  7,  7],
       [10,  7,  8],
       [ 6,  7,  6]])
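None of the results above match the desired output. One reason is that the diagonal and the upper triangle of C are all 0, and 0 always passes D <= threshold, so E is True for six entries per position that do not represent a real comparison (hence the minimum count of 6 in E.T.sum((0,1))).

One reading does appear to reproduce the desired output, though it is an assumption rather than something the question states: for each position, take every pair of distinct entries along axis 0 of A whose absolute difference is at most threshold, average the two values of each pair, then average over the qualifying pairs, with np.nan where no pair qualifies. A minimal sketch of that reading (the names rows_a, rows_b, close, pair_mean and result are illustrative):

```python
import numpy as np

threshold = 2
A = np.array([[[90, 84, 88], [10, 30, 17], [7, 0, 4]],
              [[88, 83, 102], [12, 14, 15], [12, 17, 7]],
              [[94, 14, 85], [8, 23, 20], [25, 5, 27]],
              [[150, 90, 103], [9, 16, 21], [17, 7, 12]]])

# Index pairs (a, b) with a > b along axis 0: the strict lower triangle.
rows_a, rows_b = np.tril_indices(A.shape[0], k=-1)

# One (3, 3) slice per pair, stacked along a new leading axis of length 6.
close = np.abs(A[rows_a] - A[rows_b]) <= threshold   # pair is within threshold
pair_mean = (A[rows_a] + A[rows_b]) / 2              # mean of the two values

count = close.sum(axis=0)                            # qualifying pairs per position
total = np.where(close, pair_mean, 0.0).sum(axis=0)

# Average over qualifying pairs; positions with no qualifying pair stay nan.
result = np.full(count.shape, np.nan)
np.divide(total, count, out=result, where=count > 0)
print(result)
```

Working on the strict lower triangle avoids both the always-True diagonal and the np.delete step, and the pair axis lines up with A's last two axes without any transpose.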