numpy.searchsorted(a, v, ...) basically finds, for each element of v, the index of the first element in the sorted array a that is greater than or equal to it (with the default side='left'). I assume that this is faster than a method that does not take advantage of the fact that a is sorted.
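For concreteness, a small example (values chosen arbitrarily):

```
import numpy as np

a = np.array([1, 3, 5, 7, 9])   # sorted array to search in
v = np.array([2, 5, 8])         # query values
np.searchsorted(a, v)           # -> array([1, 2, 4]) with the default side='left'
```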
Now what if v is also sorted? Certainly it should be possible to exploit this knowledge to make the algorithm faster? (Perhaps I'm wrong and the design of searchsorted makes this irrelevant; if so, please excuse my question.)

So the question is: what would be the fastest way to do the same as searchsorted(a, v), but with both a and v sorted?
CodePudding user response:
Off the top of my head, I would assume that you could solve the problem thusly:
```
import numpy as np

def searchsorted_sorted(x, a):
    # x is the sorted array to search in, a holds the (sorted) query values
    y = np.empty(len(a), dtype=np.intp)  # index dtype, like np.searchsorted
    l = 0
    for i, s in enumerate(a):
        # restrict the binary search to x[l:]; previous results guarantee pos >= l
        pos = l + np.searchsorted(x[l:], s)
        y[i] = pos
        l = pos
    return y
```
So, you could use the knowledge of v being ordered to restrict the search space for subsequent searches.

Since numpy likely uses binary search, the difference should be more pronounced when the values in v are clustered towards the end of a. If you have a large number M of entries and only search for the indices of the last n, then after the first search every later search is restricted to roughly the last n elements instead of all M. Thus, rather than taking O(n*log(M)) you would only need O(n*log(n)), which in theory makes a difference. Caveats:
- Maybe numpy already does something similar?
- Since numpy is implemented in C, it is not a fair comparison with this function (partly) written in Python.
- This implementation is rather cache-unfriendly, further reducing its usefulness.
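A quick sanity check (arbitrary test data) that the sketch above should agree with plain np.searchsorted:

```
import numpy as np

rng = np.random.default_rng(0)
a = np.sort(rng.integers(0, 1_000_000, size=100_000))
v = np.sort(rng.integers(0, 1_000_000, size=1_000))

assert np.array_equal(searchsorted_sorted(a, v), np.searchsorted(a, v))
```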
CodePudding user response:
When a and v have approximately the same size, the best algorithm consists in merging the two arrays (see: merge algorithm); note that you only need the indices and not the merged output array. This can be done in linear time (i.e. O(n + M)) instead of a quasi-linear O(n log M) time.
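A minimal pure-Python sketch of that merge idea (assuming side='left' semantics; a fast version would be compiled with C, Cython or Numba):

```
import numpy as np

def searchsorted_merge(a, v):
    # Both a and v are assumed sorted. For each v[j], advance a single
    # pointer over a to count the elements strictly smaller than v[j],
    # which is exactly the 'left' insertion index. Total work: O(n + M).
    out = np.empty(len(v), dtype=np.intp)
    i = 0
    for j, x in enumerate(v):
        while i < len(a) and a[i] < x:
            i += 1
        out[j] = i
    return out
```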
Otherwise, the problem can be solved in O(n log log M) time instead of O(n log M) time. This is only interesting if a is huge, since log M is already quite small, and one should be careful about hidden constants in algorithmic complexities. The algorithm is not trivial to implement. Here is the idea (a rough sketch follows at the end of this answer):
- Pick the middle value v[k] of v and split v virtually in two parts (left and right ones)
- Perform a binary search of v[k] in a so as to get its location p
- Now we can be sure that the items in the left part of v are located in a[:p+1] and the ones in the right part are in a[p:] (one side can be excluded in case of equality, but let us put this aside for the sake of simplicity)
- Recursively apply the operation by calling search(a[:p+1], v[:k]) and search(a[p:], v[k+1:])
Note that for the second algorithm, if v contains the same value i times, then you can copy the location found i times and search for a stricter lower/upper bound in a, so as to avoid a degenerate case possibly resulting in a worse complexity.
Because of the CPython loop overhead and the overhead of Numpy function calls, this needs to be implemented in native languages like C/C++, or in Cython, or possibly using JITs like Numba.
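For illustration only, here is a rough pure-Python sketch of the recursive splitting idea (the function name is mine, the duplicate-handling refinement above is omitted, and a usable version would be written in C/C++, Cython or Numba as just noted):

```
import numpy as np

def searchsorted_split(a, v):
    # Both a and v are assumed sorted; should return the same indices as
    # np.searchsorted(a, v) with the default side='left'.
    out = np.empty(len(v), dtype=np.intp)

    def rec(lo_a, hi_a, lo_v, hi_v):
        # Answer the queries v[lo_v:hi_v], whose insertion points are
        # known to lie in [lo_a, hi_a].
        if lo_v >= hi_v:
            return
        k = (lo_v + hi_v) // 2
        p = lo_a + np.searchsorted(a[lo_a:hi_a], v[k])
        out[k] = p
        rec(lo_a, p + 1, lo_v, k)   # left half of v lands in a[:p+1]
        rec(p, hi_a, k + 1, hi_v)   # right half of v lands in a[p:]

    rec(0, len(a), 0, len(v))
    return out
```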