Time Complexity of Binary Search-CodePudding

I'm new to coding, and I'm currently learning searching/sorting algorithms. At the moment, I am working on binary search and had difficulty understanding the time complexity of O(log2 n) for the worst case. Could someone explain to me clearly how to proceed so that I know the logic? (step by step)

Here is the pseudocode:

binarySearch( list, value, low, high ){ 

  if low <= high { 
     mid = (low   high)/ 2

     if value == list[mid]
         return mid

      else if value < list[mid]
         return binarySearch( list, value, low, mid - 1 )

      else
          return binarySearch( list, value, mid 1, high)
      } 

  else
  return -1 
}

CodePudding user response：

Overview

For a moment, forget about your code and take a look at the algorithm on paper, run it through some examples, take a look at some visualizations until you actually understand it.

Have a look at this illustration which I snacked from

The search range effectively halves in each round. So you find the number in latest log_2(n) steps.

Guess a number

You might be familiar with the "guess a number game". The game goes like

Think of a number between 0 and 100 but do not tell me the answer yet.

Now, I will try to guess the number and all you can say is:

yes, correct answer

no, the number is too high

no, the number is too low

The optimal strategy for this game is, surprise surprise, binary search. Lets play this through quickly and you will understand better.

The first guess will be 50, since that is in the middle of the current search range [0, 100]. I might say 50 is too high now. That effectively halves the search range down to [0, 49] and the next guess would be 25.

[0, 100], is it 50?
> too high

[0, 49], is it 25?
> too low

[26, 49], is it 38?
> too high

[26, 37], is it 32?
> too high

[26, 31], is it 28?
> too high

[26, 27], is it 27?
> too high

[26], is it 26?
> correct!

You just have witnessed the worst case, all guesses have been wrong until only 26 was left. We found the solution in 7 steps. log_2(100) = 6.64..., so this sounds about correct.

So why `log_2`?

By now you should have understood that the search range halves in each step. So why does this lead to log_2 mathematically?

It is actually pretty simple. How often can you divide a number by 2 until it is less equals 1? log_2 times. In other words, when does the search range collapse from n numbers down to just 1 number - in exactly¹ log_2(n) steps.

_{1. Technically you have to round it up since there are no "fractions of steps", so ceil(log_2(n)), not that it matters much though.}

Note that, due to how Big-O is defined mathematically, O(log_2(n)) and O(log(n)) denote the same set. So you will often just see O(log(n)) (without the 2) being specified as time complexity for binary search.

Code

Finally, let us take a look at your pseudocode and convince ourselves that it is actually the same strategy we just used for the "guess a number" game.

You have your list and low, high indicate the search range. As first step, you select the value in the middle:

mid = (low   high) / 2

in the previous game this was for example how [0, 100] lead to guessing 50 initially, and so on.

Now, you have the actual guess, the lookup:

if value == list[mid]

depending on if it is higher or lower, you cut the search range in halve by calling the method again with a smaller search range. Namely either with

low, mid - 1, if the number was too high,
or mid 1, high, if the number was too low.

In the game this is the step where we either go to [0, 49] or to [51, 100], based on whether 50 was too high or too low.

CodePudding user response：

Binary search searches a sorted array by repeatedly dividing the search interval in half. It starts with an interval covering the whole array. if the value of the search key is less than the item in the middle of the interval, it narrows the interval to the lower half. Otherwise, it narrows it to the upper half. It repeatedly doing this check until the value is found (and return the index value) or the interval is empty and then returns -1.

Overview

Guess a number

So why log_2?

Code

So why `log_2`?