check if value is between two values in numpy array

Time:03-08

How can I check, row-wise, whether a given number lies between two numbers? For example, 101 from b lies between 100 and 200 in a, i.e. at index 0 of a.

a = np.array([
    [100, 200],  # 101 is between 100 and 200
    [150, 160],  # 156 is between 150 and 160
])

b = np.array([[101], [156], [300]])

The answer must be:

for 101 - True, False
for 156 - False, True
for 300 - False, False

CodePudding user response:

To check whether a particular number (n) falls within the range defined by each row of a 2-column table (tbl), define the following function:

def isInRange(n, tbl):
    return np.apply_along_axis(lambda row: row[0] <= n <= row[1], 1, tbl)\
        .tolist()

Then, to get the result for all elements in b, run the following list comprehension:

result = [isInRange(x, a) for x in np.nditer(b)]

The result, for your data sample, is:

[[True, False], [True, True], [False, False]]

Note that the result for 156 is [True, True] (unlike what you wrote), because 156 lies within both ranges defined by the rows of the table.
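The question also mentions the index of the matching row ("0 index of a"). As a rough sketch, assuming row indices are what is wanted, the boolean lists from isInRange can be turned into indices with np.flatnonzero. The name matches is only illustrative.

```python
import numpy as np

def isInRange(n, tbl):
    # One boolean per row of tbl: True where row[0] <= n <= row[1]
    return np.apply_along_axis(lambda row: row[0] <= n <= row[1], 1, tbl).tolist()

a = np.array([[100, 200],
              [150, 160]])
b = np.array([[101], [156], [300]])

# Map each value in b to the indices of the rows of a whose range contains it
# ("matches" is a hypothetical name used for illustration)
matches = {int(x): np.flatnonzero(isInRange(x, a)).tolist() for x in b.ravel()}
print(matches)  # {101: [0], 156: [0, 1], 300: []}
```

An empty list means that no row of a contains the value.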

CodePudding user response:

Using nditer does not improve the timings:

In [42]: result = [isInRange(x, a) for x in np.nditer(b)]
In [43]: result
Out[43]: [[True, False], [True, True], [False, False]]
In [44]: timeit result = [isInRange(x, a) for x in np.nditer(b)]
262 µs ± 9.59 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Iterating on b directly, or on b.ravel() to get rid of the inner dimension, is marginally faster:

In [45]: result = [isInRange(x, a) for x in b.ravel()]
In [46]: result
Out[46]: [[True, False], [True, True], [False, False]]
In [47]: timeit result = [isInRange(x, a) for x in b.ravel()]
230 µs ± 13 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

The use of apply_along_axis also does not help, especially in the case of a 1d array. An equivalent list comprehension:

def foo(n, tbl):
    return [row[0] <= n <= row[1] for row in tbl]

This is much faster:

In [50]: result = [foo(x, a) for x in np.nditer(b)]
In [51]: result
Out[51]: [[True, False], [True, True], [False, False]]

In [53]: timeit result = [foo(x, a) for x in np.nditer(b)]
41.7 µs ± 1.7 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [54]: timeit result = [foo(x, a) for x in b.ravel()]
12.2 µs ± 99.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

But we don't have to do any Python-level iteration at all:

In [59]: ((a[:, 0] <= b) & (b <= a[:, 1])).tolist()
Out[59]: [[True, False], [True, True], [False, False]]
In [60]: timeit ((a[:, 0] <= b) & (b <= a[:, 1])).tolist()
12.3 µs ± 3.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

For this small example the pure list code is just as fast, but the array approach will scale better.
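The array expression works through broadcasting: a[:, 0] and a[:, 1] have shape (2,), b has shape (3, 1), so each comparison yields a (3, 2) boolean array with one row per value of b and one column per range in a. Below is a minimal self-contained sketch of that idea. The helper name in_ranges and its reshape step are assumptions added for illustration, not part of the answers above.

```python
import numpy as np

a = np.array([[100, 200],
              [150, 160]])           # one [low, high] range per row, shape (2, 2)
b = np.array([[101], [156], [300]])  # values to test, shape (3, 1)

def in_ranges(values, ranges):
    """Boolean array of shape (m, n): entry [i, j] is True when
    values[i] lies in the closed interval given by ranges[j]."""
    values = np.asarray(values).reshape(-1, 1)  # force a column, shape (m, 1)
    ranges = np.asarray(ranges)
    low, high = ranges[:, 0], ranges[:, 1]      # each of shape (n,)
    # (n,) against (m, 1) broadcasts to (m, n); & combines elementwise
    return (low <= values) & (values <= high)

mask = in_ranges(b, a)
print(mask.tolist())  # [[True, False], [True, True], [False, False]]
```

The reshape lets the helper accept b either as a column or as a flat array. Both bounds are inclusive, matching the <= comparisons used above.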
