Home > Software engineering >  How to find the region of a number in a 1D array
How to find the region of a number in a 1D array

Time:08-05

I have no idea how to search for this question so apologies if this is a duplicate.

Anyway: I have a series of breakpoints in 1D. Let's say those breakpoints are [-1, 0, 1]. This splits the 1D space into 4 regions: x < -1, -1 <= x < 0, 0 <= x < 1, x >= 1. What I want is, given some value of x, I want to find which region it would fall in (let's say as a symbol in the alphabet).

While nested if-thens would work when there are few breakpoints, this would be cumbersome if I have many. Is there any simpler way that will work for any number of breakpoints? Numpy should have something...

CodePudding user response:

Yes, we can vectorize this using numpy. The trick to find the bin of a value is to take the delta of that value with the boundary array to get a delta array, then check for the index i where this delta array is nonnegative at i but negative at i 1. More specifically, it can be done as follows

boundaries = np.array([-np.inf, -1., 0., 1., np.inf]) # put breakpoints between -np.inf and np.inf
values = np.array([-1., -0.25, 0.5, 1.])  # values whose bin you want to search
delta = values[:,None] - boundaries[None,:]
mask = (delta[:, :-1] >= 0) & (delta[:, 1:] < 0)

Running this gives the mask as

array([[False,  True, False, False],
       [False,  True, False, False],
       [False, False,  True, False],
       [False, False, False,  True]])

where j-th row only contains one True element denoting the bin that the j-th element belongs to

To get the concrete bin boundaries simply do

left_boundary_index = np.where(mask)[1]
np.stack([boundaries[left_boundary_index], boundaries[left_boundary_index   1]], axis=-1)

which gives

array([[-1.,  0.],
       [-1.,  0.],
       [ 0.,  1.],
       [ 1., inf]])

CodePudding user response:

I think there need to use binary search:

def find_range(points,x):
    points.sort()
    low = 0
    high = len(points) - 1

    if x>points[high]:
        return [points[high],'Inf']
    if x<points[low]:
        return ['-Inf',points[low]]

    while high-low > 1:
        mid = (low   high)//2
        midVal = points[mid]
        if midVal < x:
            low = mid
        elif midVal >= x:
            high = mid
    return [points[low],points[high]]

points=[-1,0,1,3,5]
print(find_range(points,6))
  • Related