How to find the region of a number in a 1D array-CodePudding

I have no idea how to search for this question so apologies if this is a duplicate.

Anyway: I have a series of breakpoints in 1D. Let's say those breakpoints are [-1, 0, 1]. This splits the 1D space into 4 regions: x < -1, -1 <= x < 0, 0 <= x < 1, x >= 1. What I want is, given some value of x, I want to find which region it would fall in (let's say as a symbol in the alphabet).

While nested if-thens would work when there are few breakpoints, this would be cumbersome if I have many. Is there any simpler way that will work for any number of breakpoints? Numpy should have something...

CodePudding user response：

Yes, we can vectorize this using numpy. The trick to find the bin of a value is to take the delta of that value with the boundary array to get a delta array, then check for the index i where this delta array is nonnegative at i but negative at i 1. More specifically, it can be done as follows

boundaries = np.array([-np.inf, -1., 0., 1., np.inf]) # put breakpoints between -np.inf and np.inf
values = np.array([-1., -0.25, 0.5, 1.])  # values whose bin you want to search
delta = values[:,None] - boundaries[None,:]
mask = (delta[:, :-1] >= 0) & (delta[:, 1:] < 0)

Running this gives the mask as

array([[False,  True, False, False],
       [False,  True, False, False],
       [False, False,  True, False],
       [False, False, False,  True]])

where j-th row only contains one True element denoting the bin that the j-th element belongs to

To get the concrete bin boundaries simply do

left_boundary_index = np.where(mask)[1]
np.stack([boundaries[left_boundary_index], boundaries[left_boundary_index   1]], axis=-1)

which gives

array([[-1.,  0.],
       [-1.,  0.],
       [ 0.,  1.],
       [ 1., inf]])

CodePudding user response：

I think there need to use binary search:

def find_range(points,x):
    points.sort()
    low = 0
    high = len(points) - 1

    if x>points[high]:
        return [points[high],'Inf']
    if x<points[low]:
        return ['-Inf',points[low]]

    while high-low > 1:
        mid = (low   high)//2
        midVal = points[mid]
        if midVal < x:
            low = mid
        elif midVal >= x:
            high = mid
    return [points[low],points[high]]

points=[-1,0,1,3,5]
print(find_range(points,6))