I have no idea how to search for this question so apologies if this is a duplicate.
Anyway: I have a series of breakpoints in 1D. Let's say those breakpoints are [-1, 0, 1]. This splits the 1D space into 4 regions: x < -1, -1 <= x < 0, 0 <= x < 1, x >= 1
. What I want is, given some value of x
, I want to find which region it would fall in (let's say as a symbol in the alphabet).
While nested if-thens would work when there are few breakpoints, this would be cumbersome if I have many. Is there any simpler way that will work for any number of breakpoints? Numpy should have something...
CodePudding user response:
Yes, we can vectorize this using numpy. The trick to find the bin of a value is to take the delta of that value with the boundary array to get a delta array, then check for the index i
where this delta array is nonnegative at i
but negative at i 1
. More specifically, it can be done as follows
boundaries = np.array([-np.inf, -1., 0., 1., np.inf]) # put breakpoints between -np.inf and np.inf
values = np.array([-1., -0.25, 0.5, 1.]) # values whose bin you want to search
delta = values[:,None] - boundaries[None,:]
mask = (delta[:, :-1] >= 0) & (delta[:, 1:] < 0)
Running this gives the mask
as
array([[False, True, False, False],
[False, True, False, False],
[False, False, True, False],
[False, False, False, True]])
where j-th row only contains one True
element denoting the bin that the j-th element belongs to
To get the concrete bin boundaries simply do
left_boundary_index = np.where(mask)[1]
np.stack([boundaries[left_boundary_index], boundaries[left_boundary_index 1]], axis=-1)
which gives
array([[-1., 0.],
[-1., 0.],
[ 0., 1.],
[ 1., inf]])
CodePudding user response:
I think there need to use binary search:
def find_range(points,x):
points.sort()
low = 0
high = len(points) - 1
if x>points[high]:
return [points[high],'Inf']
if x<points[low]:
return ['-Inf',points[low]]
while high-low > 1:
mid = (low high)//2
midVal = points[mid]
if midVal < x:
low = mid
elif midVal >= x:
high = mid
return [points[low],points[high]]
points=[-1,0,1,3,5]
print(find_range(points,6))