How can I speed up my implementation of the problem statement below? I have a correct solution that passes every test for small inputs. However, it exceeds the time limits for larger inputs. My current implementation is quadratic in the size of the array.
Problem statement
You have an unsorted array
arr of non-negative integers and a number s
. Find a longest contiguous subarray in arr
that has a sum equal to s
. Return two integers that represent its inclusive bounds. If there are several possible answers, return the one with the smallest left bound. If there are no answers, return [-1].
Your answer should be 1-based, meaning that the first position of the array is 1 instead of 0.
Implementation
def findLongestSubarrayBySum(s, arr):
"""Return a two-element array that represent the bounds of the subarray."""
ans = [-1]
for left in range(len(arr)):
curr_sum = arr[left]
if curr_sum == s and len(ans) == 1:
ans = [left 1] * 2
for right in range(left 1, len(arr)):
curr_sum = arr[right]
if curr_sum == s:
# First found soltion
if len(ans) == 1:
ans = [left 1, right 1]
# Left bound is right bound
elif ans[1] == ans[0]:
ans = [left 1, right 1]
# Longer subarray
elif ans[1] - ans[0] < right - left:
ans = [left 1, right 1]
elif curr_sum > s:
break
return ans
if __name__ == '__main__':
# s = 12 # ans = [2, 4]
# arr = [1, 2, 3, 7, 5]
# s = 15 # ans = [1, 5]
# arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# s = 15 # ans = [1, 8]
# arr = [1, 2, 3, 4, 5, 0, 0, 0, 6, 7, 8, 9, 10]
# s = 3 # ans = -1
# arr = [1, 1]
# s = 3 # ans = -1
# arr = [2]
# s = 468 # ans = [42, 46]
# arr = [135, 101, 170, 125, 79, 159, 163, 65, 106, 146, 82, 28,
# 162, 92, 196, 143, 28, 37, 192, 5, 103, 154, 93, 183, 22,
# 117, 119, 96, 48, 127, 0, 172, 0, 139, 0, 0, 70, 113, 68,
# 100, 36, 95, 104, 12, 123, 134]
# s = 3 # ans = [1, 1]
# arr = [3]
# s = 0 # ans = [2, 2]
# arr = [1, 0, 2]
s = 3 # ans = [1, 3]
arr = [0, 3, 0]
print(findLongestSubarrayBySum(s, arr))
CodePudding user response:
If the input contains negative numbers, use prefix sums conceptually and a hash table in practice. prefixes_j - prefixes_i = s
. For each prefixes_j
, which is simply a single, accumulating sum, check if the key, prefixes_j - s
, exists in the hash table. The value stored at that key will be the first index where that prefix sum was seen, that way, the leftmost index is returned for duplicates.
[1, 2, 3, 7, 5] s = 12
sum 1
hash 1
sum 3
hash 1, 3
sum 6
hash 1, 3, 6
sum 13
hash contains 1
found 13 - 12 = 1
record length 3
etc...
If the input doesn't contain negative numbers, use the idea that the right pointer can stop as soon as the sum exceeds s, then increment the left pointer until it's smaller again, alternating pointer increments that way, so each pointer traverses the list once in total.
CodePudding user response:
Here's an implementation of the two-pointer approach that's linear in time and constant in extra space overhead. It's similar to your solution, except we don't regrow the window from size 0 on every loop iteration. Since you only want the earliest, largest window, the right pointer only ever increases (rather than increase and decrease), making the time complexity easy to prove.
def findLongestSubarrayBySum(s, arr):
"""Return a two-element array that represent the bounds of the subarray."""
ans = [-1, -2]
s -= arr[0]
right = 0
for left in range(len(arr)):
while right < len(arr) - 1 and s >= arr[right 1]:
s -= arr[right 1]
right = 1
if s == 0 and right - left > ans[-1] - ans[0]:
ans = [left 1, right 1]
s = arr[left]
if ans[0] == -1:
return [-1]
return ans
CodePudding user response:
You may use two pointer method whose time complexity will be O(n)
and space complexity will be O(1)
. A similar problem and two pointer solution can be found here.