Return index of all matching values sorted list-CodePudding

I was recently given this in an interview and have been unable to solve it. Given a sorted list:

li = [1,2,2,3,4,4,4,4,5,5,5...]

Return the index of all elements matching a target, ex. 4, with O(log n) time complexity.

The setup of this problem suggested to me that it was a binary search problem. I answered it with something like the below, but haven't been able to come up with anything nicer:

data = [2,3,5,6,8,9,12,12,12,14,17,19,22,25,27,28,33,37]
target = 12

def binary_search(data, target):
    low = 0 
    high = len(data) - 1

    while low <= high:
        mid = (high   low) // 2
        if data[mid] == target:
            return mid
        elif data[mid] > target:
            high = mid - 1
        else:
            low = mid   1
    
    return False

def return_other_indices(data, target, match_index):
    output = [match_index]

    i = match_index   1
    while data[i] == target:
        output.append(i)
        i  = 1

    i = match_index - 1
    while data[i] == target:
        output.append(i)
        i -= 1

    return output

match_index = binary_search(data, target)
output = return_other_indices(data, target, match_index)
print(output)

Is there a nicer way to do this?

CodePudding user response：

I am not sure if this is what you're looking for, but here is a solution using python standard library.

using `bisect` only:

from bisect import bisect_left, bisect
start = bisect_left(data, target)
stop = bisect(data, target, lo=start)
output = range(start, stop)
print(list(output))

output: [6,7,8]

older answer:

It is using bisect.bisect_left to find the starting point, and itertools.takewhile to get all elements.

from bisect import bisect_left
from itertools import takewhile, islice
left = bisect_left(data, target)
[left] [i[0] left 1 for i in takewhile(lambda x: x[1]==target,
                                       enumerate(islice(data, left 1, len(data))))]

NB. the linear version with takewhile might be faster when only a few targets are expected and the data if large, otherwise the double bisect will be faster in general

CodePudding user response：

First of all, you have to take into account what binary search is, and how it works. Now applied to the problem you have to modify looking for the first and the last.

Based on that, perform a search for the first occurrence, and then the last occurrence applying binary search.

By having both indices I make a simple print of the numbers between the interval

def findFirstOccurrence(nums, target):
    (left, right) = (0, len(nums) - 1)
    result = -1
    while left <= right:
        mid = (left   right) // 2
        if target == nums[mid]:
            result = mid
            right = mid - 1
        elif target < nums[mid]:
            right = mid - 1
        else:
            left = mid   1
    return result
    
def findLastOccurrence(nums, target):
    (left, right) = (0, len(nums) - 1)
    result = -1
    while left <= right:
        mid = (left   right) // 2
        if target == nums[mid]:
            result = mid
            left = mid   1
        elif target < nums[mid]:
            right = mid - 1
        else:
            left = mid   1
    return result


if __name__ == '__main__':
 
    nums = [2, 5, 5, 5, 5, 5, 6, 6, 8, 9, 9, 9]
    target = 5
    
    indexFirst = findFirstOccurrence(nums, target)
    indexLast = findLastOccurrence(nums, target)
    
    if(indexLast == -1 or indexFirst == -1):
        print("Element not found")
    else:
        print(*range(indexFirst,indexLast 1))

UPDATE

Here an optimiced form of above:

def optimicedOccurrence(nums, target, type_of_occurrence):
    (left, right) = (0, len(nums) - 1)
    result = -1
    while left <= right:
        
        mid = (left   right) // 2
        if target == nums[mid]:
            result = mid
            if(type_of_occurrence):
                right = mid - 1
            else:
                left = mid   1
        elif target < nums[mid]:
            right = mid - 1
        else:
            left = mid   1
    return result


if __name__ == '__main__':
 
    nums = [2, 5, 5, 5, 5, 6, 6, 8, 9, 9, 9]
    target = 5
    
    indexFirst = optimicedOccurrence(nums, target, 1)
    indexLast = optimicedOccurrence(nums, target, 0)
    
    if(indexLast == -1 or indexFirst == -1):
        print("Element not found")
    else:
        print(*range(indexFirst,indexLast 1))

CodePudding user response：

Use the bisect module:

from bisect import bisect_left,bisect_right

li = [1,2,2,3,4,4,4,4,5,5,5]

target = 4
print(*range(bisect_left(li,target),bisect_right(li,target)))

4 5 6 7

using bisect only:

older answer:

using `bisect` only: