Binary Search through multiple elements-CodePudding

I had this question on an exam,

I have a very slow computer which takes one second for binary search on an array with 1,000 (one thousand) elements. How long should I expect the computer to take for binary search on an array with 1,000,000 (one million) elements?

I have trouble understanding this, would it take longer to search through 1 million elements than 1,000 or does it depend on the computer?

CodePudding user response：

It does depend on the computer and also depends on the size of the array. Since in this question the same computer is used, we can abstract the effects of the computer and focus on the sample size.

Binary search has logarithmic time complexity. If you take the difference between log(1_000) and log(1_000_000) you will see that you should expect almost double the time to search in the 1 million elements array.

This is assuming worst case. For the average case, the calculation gets more complex, you can check this wikipedia page for a deeper analysis.

CodePudding user response：

First, a binary search requires the list or array to be sorted. And since you are dividing each list/sublist in half it will take log2(N) searches to find the correct item where log2 is log to the base 2.

So for 1_000_000 items it should only take about 20 comparisons to home in on the item. So it should be very quick (sans the time to sort the list to begin with).

1000 items would take about 10 comparisons. And one second, imo, is much to long to do such a simple search on any modern day processor.

CodePudding user response：

Binary search has O(log(N)) time complexity, completion time is

t = c * log(N, 10)

In given case for N = 1000 we know t and thus we can find out c

1 = c * log(1000, 10) 

1 = c * 3

c = 1/3

For N = 1000000 we have

t = 1 / 3 * log(1000000, N) =
  = 1 / 3 * 6 = 
  = 2

So we can expect that binary search within 1000000 items will be completed in 2 seconds.

Please, note that O(log(N)) == O(log(N, m)) since log(N, m) == log(N, k) / log(m, k), which means that when working with O(log(N)) we are free to choose logarithm base. In our case (1000 and 1000000) it's convenient to use base 10 (or 1000)