I'm working on an implementation of Binary Search in Python as part of a course (Algorithmic Toolbox on Coursera).
The challenge is to create an implementation of Binary Search that returns the index of a query (an integer) in an array. So for example, if I call Binary Search with an array of [1,5,7] and my query is 5, Binary Search should return 1. If the target is not in the array, then it should return -1. For example, I give it [1, 67, 88, 200] and my target is 999, then it should return -1.
This problem operates on the assumption that all test cases are feeding it arrays that are sorted and do not have duplicates.
My current implementation uses a helper function to actually conduct the Binary Search. This helper returns either the target value itself if it can be located, or -1 if it can't be found. The main function calling the helper takes the result, and if it isn't -1, it looks up the original index in a dictionary I create at the start of the main function.
In my own private test cases, the code functions with correct results, however upon submission, I am told that the code simply takes too long to run to be considered acceptable.
So my request is this: please review my code, and please help me figure out how to change it to run more efficiently. My code is posted below:
def single_binary_search(keys, target):
'''
Takes an array of integers, keys, and searches for a value, target, in the array. If the target is found to be
within the array, it returns the target value. Else, it returns -1 if the array is considered invalid or if the
target value does not exist within the keys array.
Input:
keys: an array of integers sorted in increasing order
target: an integer we wish to ascertain is within keys
Returns:
the target value if it is located, or -1 if it is not or if the array is invalid
'''
start = 0 # define start index
end = len(keys)-1 # define end index
if end < start: # check if array is invalid or if target is not in keys. If so, return -1.
return -1
mp = start (end-start)//2 # calculate the midpoint index
if target == keys[mp]: # check to see if the target is at the midpoint.
return keys[mp] # return the target value if located in keys
elif target < keys[mp]: # if target is less than mp, recursively call binary search on the lower array
return single_binary_search(keys[:mp], target)
else: # if target is greater than the mp, recursively call binary search on the upper array
return single_binary_search(keys[mp 1:], target)
def binary_search(keys, query):
keys_dictionary = {k: v for v, k in enumerate(keys)} # Create a dictionary of keys to track the indices
result = single_binary_search(keys, query) # search for the individual query in keys and store the result
if result != -1: # if we found the target, use the keys dictionary to look up the index and return it
return keys_dictionary[result]
else: # if the target was not found or the keys array was invalid, return -1
return result
An earlier implementation tried to add two new arguments, start = 0, end = None
, to the single_binary_search()
function, but I scrapped it because I kept running into endless recursions that I knew how to fix only by tacking on specific if
conditions to account for exceptions.
CodePudding user response:
Here is a more conventional binary search (see also Wikipedia) that will work better:
def binary_search(keys, query):
L, R = 0, len(keys) - 1
while L <= R:
M = L (R - L) // 2
if keys[M] == query:
return M
elif query < keys[M]:
R = M - 1
else:
L = M 1
return -1
print("keys:", list(range(0,20,2)))
print(14, binary_search(list(range(0,20,2)), 14))
print(15, binary_search(list(range(0,20,2)), 15))
print(-1, binary_search(list(range(0,20,2)), -1))
print(20, binary_search(list(range(0,20,2)), 20))
print(0, binary_search(list(range(0,20,2)), 0))
print(18, binary_search(list(range(0,20,2)), 18))
Test case output:
keys: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
14 7
15 -1
-1 -1
20 -1
0 0
18 9
Some issues in your implementation are that slicing is expensive and best avoided, and also that your dictionary mechanic seems unnecessary, since binary search operates on the indexes of the array being searched, and therefore you shouldn't need to do any extra work to get the index of the match (if any).
CodePudding user response:
Creating the argument keys[:mp] is a slicing operation that creates a copy and takes time linear in the size of the copy. So your algorithm actually takes time n/2 n/4 n/8 …=O(n). If you passed only the indices of the first and last elements of new subarray, the algorithm would take logarithmic time. You can use a list and pass indices as arguments.