Count how often values in a 2D array appear in a 3D array

Time:12-09

I have one 2-dimensional numpy array and one 3-dimensional array. For each number in the first array I would like to count how often this value or a more extreme one appears in the second array (taking the 3rd dimension as the comparison vector for each element in the first array). For 0 values the function should return np.nan, since it's not possible to decide whether 0s should be compared to negative or positive numbers.

EDIT: By 'extreme' I mean that positive values in a should only be compared with positive values in b (counting those that are at least as large), and negative values only with negative values in b (counting those that are at least as small).

Example:

import numpy as np

np.random.seed(42)

# this is the 2D array
a = np.random.randint(low=-5, high=5, size=(5, 5))

# for each value in a, count how often this value or a more extreme one appears
# in b (taking the last dimension of b as comparison vectors)
b = np.random.randint(low=-5, high=5, size=(5, 5, 5))

# expected result
result = np.array([[2, 2, 1, 1, 2],
                   [0, 1, 1, 3, 3],
                   [1, 3, 2, 2, np.nan],
                   [2, 0, 0, np.nan, 1],
                   [3, 0, 1, np.nan, 3]])

CodePudding user response:

For all these operations, you will want to transform a into a 3D array to utilize broadcasting:

a3 = a[..., None]

You can use np.sign to normalize the direction of the extrema:

s = np.sign(a3)
a3 = a3 * s  # not in-place: a3 is a view of a, so `a3 *= s` would overwrite a
b3 = b * s

Now all your extrema are positive, so you can count the number of times something is greater than or equal to the corresponding element of a3:

result = (b3 >= a3).sum(axis=-1)
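To make the sign trick concrete, here is a minimal sketch on toy data. The arrays below are made up for illustration and are not from the question. Multiplying both sides by the sign of a turns "less than or equal" for negative values into "greater than or equal", so one comparison covers both cases:

```python
import numpy as np

# Hypothetical toy data, chosen so the counts are easy to check by hand.
a = np.array([[3, -2]])                      # shape (1, 2)
b = np.array([[[1, 3, 4, -5, 0],             # compared against a[0, 0] = 3
               [-1, -2, -4, -3, 5]]])        # compared against a[0, 1] = -2

s = np.sign(a)[..., None]                    # shape (1, 2, 1), broadcasts over b

# For 3: values >= 3 are 3 and 4.
# For -2: values <= -2 are -2, -4 and -3 (after flipping signs: >= 2).
counts = (b * s >= a[..., None] * s).sum(axis=-1)
print(counts)
```

Building new arrays with `*` rather than `*=` leaves the original a untouched.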

If you want to set zero elements to np.nan, you will first need a floating point array. The simplest way to get one is to specify the dtype in the previous line:

result = (b3 >= a3).sum(axis=-1, dtype=float)
result[a == 0] = np.nan

This can be written more concisely as:

s = np.sign(a)[..., None]
result = (b * s >= a[..., None] * s).sum(axis=-1, dtype=float)
result[a == 0] = np.nan
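If you want to convince yourself that the vectorized version does what the question asks, one option is to compare it against a plain loop. The sketch below assumes the semantics described in the question; the function names `count_extreme` and `count_extreme_loop` are made up for this example:

```python
import numpy as np


def count_extreme(a, b):
    """Vectorized count of values in b[i, j, :] at least as extreme as a[i, j]."""
    s = np.sign(a)[..., None]
    result = (b * s >= a[..., None] * s).sum(axis=-1, dtype=float)
    result[a == 0] = np.nan  # zeros have no direction, so no count is defined
    return result


def count_extreme_loop(a, b):
    """Slow reference version that spells out the comparison per element."""
    out = np.full(a.shape, np.nan)
    for idx in np.ndindex(a.shape):
        v = a[idx]
        if v > 0:
            out[idx] = np.count_nonzero(b[idx] >= v)
        elif v < 0:
            out[idx] = np.count_nonzero(b[idx] <= v)
    return out


# Arbitrary random test data (not the arrays from the question).
rng = np.random.default_rng(0)
a = rng.integers(-5, 5, size=(5, 5))
b = rng.integers(-5, 5, size=(5, 5, 5))

fast = count_extreme(a, b)
slow = count_extreme_loop(a, b)

# assert_array_equal treats NaNs in matching positions as equal.
np.testing.assert_array_equal(fast, slow)
```

The loop is only there as a check; for large arrays the broadcasting version should be much faster.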

CodePudding user response:

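Try using np.count_nonzero() (NumPy has no np.count function).

Applied to the boolean comparison along the last axis, this function should return the counts you want.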
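NumPy's counting function for boolean masks is np.count_nonzero, which accepts an axis argument. A minimal sketch of how it could be applied here, using made-up toy arrays rather than the data from the question:

```python
import numpy as np

# Hypothetical toy data: one row of three values, each with its own comparison vector.
a = np.array([[2, 0, -1]])
b = np.array([[[2, 3, -4],
               [1, 0, -1],
               [-1, -3, -4]]])

s = np.sign(a)[..., None]

# count_nonzero returns integers, so convert to float before writing NaN.
counts = np.count_nonzero(b * s >= a[..., None] * s, axis=-1).astype(float)
counts[a == 0] = np.nan

# counts holds 2 for the value 2, nan for 0, and 3 for the value -1
print(counts)
```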
