Smallest interval whose sum exceeds a threshold-CodePudding

In a non-negative 1D array x, we wish to find the shortest slice such that x[start:end].sum() > threshold, with the optional constraint that start <= argmax(x) < end. Must work with floats and integers.

Is there a known algorithm to accomplish this? I have implemented one here, but it's not as fast as I wish, and ideally there shouldn't be a heavyweight dependency (e.g. numba). Example:

x = [0, 1, 2, 3, 4, 9, 7, 1, 0, 0, 0]
threshold = 3   4   9   7 - .01

start, end = func(x, threshold)
print(x[start:end])

[3 4 9 7]

@גלעד ברקן's answer is acceptably fast in numba; my implem. Though, non-numba is strongly preferred, meaning minimizing if, for, etc - unless someone can point to a lightweight numba...

CodePudding user response：

Just my few cents:

Because the array is non-negative the list containing running sum at each index is already a sorted list.
Compute running sum at each index and construct a list of tuples (running_sum, current_index).
If the running sum at an index exceeds the threshold look for the greatest value that is less than (running_sum - threshold) in the list of running sums excluding the one at current index using binary search
Now the current range is the [index_found_by_binary_search 1, curr_index]
Minimize upon all such ranges.

In your example: the running sum is: [0, 1, 3, 6, 10, 19, 26, 27, 27, 27, 27]

The sum(26) exceeds threshold of 22.99 at index = 6

Now difference between threshold and sum is 26 - 22.99 = 3.01.

Look for the greatest value that is smaller than 3.01 - this can be done by binary search. We need greatest value because we want to minimize the range.

The greatest value is 3 that is smaller than 3.01 and can be found at index 2.

The result is the range [2 1, 6] = [3,6]

Now do this all along the array and minimize upon the length of the range.

Not sure if that is what you already did.

CodePudding user response：

This reminds me of the problem of finding the smallest window that includes all members of a set, which can be solved in O(n) time and O(k) space with two pointers. In our case, O(1) space would seem to be enough since we're only interested in a sum.

Move the right pointer until the sum is achieved. Move the left pointer just until the sum is again too low. Then move the right pointer, etc. Addressing the argmax constraint left to the reader.