Walking through the code interviews of IT international companies I run into interesting problem.
How many comparisons do we have to make to figure out what element out of six is the second smallest or the second largest.
None of these six elements have the same value.
we have main function with six arguments (v1, ..., v6)
def solve(v1, v2, v3, v4, v5, v6): # the worst case -> 9 comparisons if isLarger(v1, v2): v1, v2 = v2, v1 if isLarger(v1, v3): v1, v3 = v3, v1 if isLarger(v1, v4): v1, v4 = v4, v1 if isLarger(v1, v5): v1, v5 = v5, v1 if isLarger(v1, v6): v1, v6 = v6, v1 if isLarger(v2, v3): v2, v3 = v3, v2 if isLarger(v2, v4): v2, v4 = v4, v2 if isLarger(v2, v5): v2, v5 = v5, v2 if isLarger(v2, v6): v2, v6 = v6, v2 print(f"#comparisons = {CntComparisons}") return v2
which returns the second smallest or the second largest value.
Determine this value by comparison (i.e. it cannot use indexing by that value).
For pairwise comparison we can use only the below function
CntComparisons = 0 def isLarger(v1, v2): global CntComparisons CntComparisons = 1 return v1 > v2
Values are compared only by calling the comparison function isLarger(v1, v2).
The goal is to find an algorithm that requires (even in the worst case) as few comparisons as possible!
Any ideas or hint how to solve this?
CodePudding user response:
Finding the best element takes invariably n-1
comparisons. The second best element is among those who lose the comparison only to the best.
By arranging an elimination tournament you guarantee that there are at most log n
candidates for the second best, and finding the best among them takes log n
comparisons.
Therefore, for 6 elements you need 5 2 = 7 comparisons.
As of expressing this in the code, here is a first round of elimination:
if isLarger(v1, v4):
v1, v4 = v4, v1
if isLarger(v2, v5):
v2, v5 = v5, v2
if isLarger(v3, v6):
v3, v6 = v6, v3
Now next two matches are a bit trickier. You'd have to swap the corresponding losers from the first round as well, eg
if isLarger(v2, v3):
v2, v3 = v3, v2
v5, v6 = v6, v5
to maintain the invariants like v2 < v5
.
After that the candidates are v2, v3, v4
(or just v2, v4
, if the leader v1
was not displaced). In two comparisons find the best among them.
CodePudding user response:
A trivial, but not optimal way is using the first two passes of bubble sort: this will swap pairs of variables and so bubble the 2 greatest values to the "right" and assign them to v5 and v6, and so v5 will be returned as correct answer. This requires 9 comparisons: 5 in the first pass, 4 in the second. So we have an upper bound of 9 comparisons.
A lower bound is 5, since that is the number of comparisons needed to find either the minimum or the maximum, and that will be needed to be sure a value is the second least or second greatest.
Here is an idea:
Perform 2 to 3 comparisons to sort v1, v2, v3 through swaps from least to greatest. We then know that v2 is not the least nor the greatest.
Perform 3 comparisons between v2 and the last 3 values (v3, v4 and v5).
If these comparisons all give the same boolean result, it means that v2 is indeed the answer.
If among those boolean results there is only one True (i.e. v2 is only greater than one of them), then let vx be the one among v3, v4 or v5 for which v2 is greater than vx. Return the greatest of v1 and vx: this needs another, 7th comparison.
If among those boolean results there is only one False (i.e. v2 is greater than two of them), then let vx be the one among v3, v4 or v5 for which v2 is not greater than vx. Return the least of v3 and vx: this needs another, 7th comparison.
So in the worst case we get the result with 7 comparisons. The best case needs 5 comparisons (2 for the sorting step, and c4, c5 and c6).
Here is an implementation of this idea:
def solve(v1, v2, v3, v4, v5, v6):
# Sort (v1, v2, v3)
if isLarger(v1, v2):
v1, v2 = v2, v1
if isLarger(v2, v3):
v2, v3 = v3, v2
if isLarger(v1, v2):
v1, v2 = v2, v1
# v2 is now certainly not the greatest nor the least
# Check how it compares with v4, v5 and v6
c4 = isLarger(v2, v4)
c5 = isLarger(v2, v5)
c6 = isLarger(v2, v6)
if c4 == c5 == c6: # All true or all false
return v2
if c4 ^ c5 ^ c6: # Only one of them is true
vx = v4 if c4 else (v5 if c5 else v6)
return v1 if isLarger(v1, vx) else vx
else: # Only one of them is false
vx = v4 if not c4 else (v5 if not c5 else v6)
return vx if isLarger(v3, vx) else v3
I ran this on all permutations of [1,2,3,4,5,6]
and these are the results:
- 5 comparisons needed for 96 permuations
- 6 comparisons needed for 336 permutations
- 7 comparisons needed for 288 permutations
On average: 6.266667 comparisons
CodePudding user response:
First of all if you want to find the kth largest or kth smallest element from a list or an array there are mainly two methods
- Naive (Casual Approach)
- Ingenious method (Modern Approach)
The naive method is to pop out the element each time after sorting for k times. It goes like,
def myfun(l):
for i in range(k):
l=sorted(l)
l.pop()
return sorted(l)[0]