I was recently given this in an interview and have been unable to solve it. Given a sorted list:
li = [1,2,2,3,4,4,4,4,5,5,5...]
Return the index of all elements matching a target, ex. 4
, with O(log n) time complexity.
The setup of this problem suggested to me that it was a binary search problem. I answered it with something like the below, but haven't been able to come up with anything nicer:
data = [2,3,5,6,8,9,12,12,12,14,17,19,22,25,27,28,33,37]
target = 12
def binary_search(data, target):
low = 0
high = len(data) - 1
while low <= high:
mid = (high low) // 2
if data[mid] == target:
return mid
elif data[mid] > target:
high = mid - 1
else:
low = mid 1
return False
def return_other_indices(data, target, match_index):
output = [match_index]
i = match_index 1
while data[i] == target:
output.append(i)
i = 1
i = match_index - 1
while data[i] == target:
output.append(i)
i -= 1
return output
match_index = binary_search(data, target)
output = return_other_indices(data, target, match_index)
print(output)
Is there a nicer way to do this?
CodePudding user response:
I am not sure if this is what you're looking for, but here is a solution using python standard library.
using bisect
only:
from bisect import bisect_left, bisect
start = bisect_left(data, target)
stop = bisect(data, target, lo=start)
output = range(start, stop)
print(list(output))
output: [6,7,8]
older answer:
It is using bisect.bisect_left
to find the starting point, and itertools.takewhile
to get all elements.
from bisect import bisect_left
from itertools import takewhile, islice
left = bisect_left(data, target)
[left] [i[0] left 1 for i in takewhile(lambda x: x[1]==target,
enumerate(islice(data, left 1, len(data))))]
NB. the linear version with takewhile
might be faster when only a few targets are expected and the data if large, otherwise the double bisect will be faster in general
CodePudding user response:
First of all, you have to take into account what binary search is, and how it works. Now applied to the problem you have to modify looking for the first and the last.
Based on that, perform a search for the first occurrence, and then the last occurrence applying binary search.
By having both indices I make a simple print of the numbers between the interval
def findFirstOccurrence(nums, target):
(left, right) = (0, len(nums) - 1)
result = -1
while left <= right:
mid = (left right) // 2
if target == nums[mid]:
result = mid
right = mid - 1
elif target < nums[mid]:
right = mid - 1
else:
left = mid 1
return result
def findLastOccurrence(nums, target):
(left, right) = (0, len(nums) - 1)
result = -1
while left <= right:
mid = (left right) // 2
if target == nums[mid]:
result = mid
left = mid 1
elif target < nums[mid]:
right = mid - 1
else:
left = mid 1
return result
if __name__ == '__main__':
nums = [2, 5, 5, 5, 5, 5, 6, 6, 8, 9, 9, 9]
target = 5
indexFirst = findFirstOccurrence(nums, target)
indexLast = findLastOccurrence(nums, target)
if(indexLast == -1 or indexFirst == -1):
print("Element not found")
else:
print(*range(indexFirst,indexLast 1))
UPDATE
Here an optimiced form of above:
def optimicedOccurrence(nums, target, type_of_occurrence):
(left, right) = (0, len(nums) - 1)
result = -1
while left <= right:
mid = (left right) // 2
if target == nums[mid]:
result = mid
if(type_of_occurrence):
right = mid - 1
else:
left = mid 1
elif target < nums[mid]:
right = mid - 1
else:
left = mid 1
return result
if __name__ == '__main__':
nums = [2, 5, 5, 5, 5, 6, 6, 8, 9, 9, 9]
target = 5
indexFirst = optimicedOccurrence(nums, target, 1)
indexLast = optimicedOccurrence(nums, target, 0)
if(indexLast == -1 or indexFirst == -1):
print("Element not found")
else:
print(*range(indexFirst,indexLast 1))
CodePudding user response:
Use the bisect module:
from bisect import bisect_left,bisect_right
li = [1,2,2,3,4,4,4,4,5,5,5]
target = 4
print(*range(bisect_left(li,target),bisect_right(li,target)))
4 5 6 7