I'm new to coding, and I'm currently learning searching/sorting algorithms. At the moment, I am working on binary search and had difficulty understanding the time complexity of O(log2 n) for the worst case. Could someone explain to me clearly how to proceed so that I know the logic? (step by step)
Here is the pseudocode:
binarySearch( list, value, low, high ){
if low <= high {
mid = (low high)/ 2
if value == list[mid]
return mid
else if value < list[mid]
return binarySearch( list, value, low, mid - 1 )
else
return binarySearch( list, value, mid 1, high)
}
else
return -1
}
CodePudding user response:
Overview
For a moment, forget about your code and take a look at the algorithm on paper, run it through some examples, take a look at some visualizations until you actually understand it.
Have a look at this illustration which I snacked from
The search range effectively halves in each round. So you find the number in latest log_2(n)
steps.
Guess a number
You might be familiar with the "guess a number game". The game goes like
Think of a number between 0 and 100 but do not tell me the answer yet.
Now, I will try to guess the number and all you can say is:
- yes, correct answer
- no, the number is too high
- no, the number is too low
The optimal strategy for this game is, surprise surprise, binary search. Lets play this through quickly and you will understand better.
The first guess will be 50
, since that is in the middle of the current search range [0, 100]
. I might say 50
is too high now. That effectively halves the search range down to [0, 49]
and the next guess would be 25
.
[0, 100], is it 50?
> too high
[0, 49], is it 25?
> too low
[26, 49], is it 38?
> too high
[26, 37], is it 32?
> too high
[26, 31], is it 28?
> too high
[26, 27], is it 27?
> too high
[26], is it 26?
> correct!
You just have witnessed the worst case, all guesses have been wrong until only 26 was left. We found the solution in 7 steps. log_2(100) = 6.64...
, so this sounds about correct.
So why log_2
?
By now you should have understood that the search range halves in each step. So why does this lead to log_2
mathematically?
It is actually pretty simple. How often can you divide a number by 2
until it is less equals 1
? log_2
times. In other words, when does the search range collapse from n
numbers down to just 1
number - in exactly1 log_2(n)
steps.
1. Technically you have to round it up since there are no "fractions of steps", so ceil(log_2(n))
, not that it matters much though.
Note that, due to how Big-O is defined mathematically, O(log_2(n))
and O(log(n))
denote the same set. So you will often just see O(log(n))
(without the 2
) being specified as time complexity for binary search.
Code
Finally, let us take a look at your pseudocode and convince ourselves that it is actually the same strategy we just used for the "guess a number" game.
You have your list
and low, high
indicate the search range. As first step, you select the value in the middle:
mid = (low high) / 2
in the previous game this was for example how [0, 100]
lead to guessing 50
initially, and so on.
Now, you have the actual guess, the lookup:
if value == list[mid]
depending on if it is higher or lower, you cut the search range in halve by calling the method again with a smaller search range. Namely either with
low, mid - 1
, if the number was too high,- or
mid 1, high
, if the number was too low.
In the game this is the step where we either go to [0, 49]
or to [51, 100]
, based on whether 50
was too high or too low.
CodePudding user response:
Binary search searches a sorted array by repeatedly dividing the search interval in half. It starts with an interval covering the whole array. if the value of the search key is less than the item in the middle of the interval, it narrows the interval to the lower half. Otherwise, it narrows it to the upper half. It repeatedly doing this check until the value is found (and return the index value) or the interval is empty and then returns -1.