Home > Back-end >  How to find the worst case permutation for the MergeSort algorithm
How to find the worst case permutation for the MergeSort algorithm

Time:12-06

I am getting a recursion error and when reassigning the recursive limit, I get a memory error when trying to run the following code.

def join(A, left, right, l, m, r):
    x = 0
    for x in range(m-l):
        A[x] = left[x]
    for j in range(r-m):
        A[x j] = right[j]enter code here

def split(A, left, right, l, m, r):
    for i in range(0, m-l, 1):
        left[i] = A[i*2]
    for i in range(0, r-m, 1):
        right[i] = A[i*2 1]

def generateWorstCase(A, l, r):
    if l < r:
        m = int(l   (r-1) / 2)
        left = [0 for i in range(m - l   1)]
        right = [0 for i in range(r - m)]
        split(A, left, right, l, m, r)
        generateWorstCase(left, l, m)
        generateWorstCase(right, m 1, r)
        join(A, left, right, l, m, r)

arr = [1, 2, 3, 4, 5, 6, 7, 8]
generateWorstCase(arr, 0, len(arr)-1)
print(arr)

I tried translating the example given from geeksforgeeks https://www.geeksforgeeks.org/find-a-permutation-that-causes-worst-case-of-merge-sort/, and I am still confused about writing the code in python. I understand the fundamentals of how it works (as in it causes the mergeSort algorithm to compare the highest amount). I appreciate any tips to help with this.

CodePudding user response:

Fixing the main mistake

The way you calculate your indices is wrong.

m = int(l   (r-1) / 2)

Let's try this with actual numbers; for instance:

l = 100
r = 110
m = ?    # should be in the middle, maybe 104 or 105?

m = int(l   (r-1)/2) 
m = int(100   109/2)
m = int(100   54.5)
m = 154  # wrong

This is just an error with parentheses. To fix it:

m = (l   r) // 2
m = (100   110) // 2
m = 105

Note it's better to use a // b than int(a / b). Operator / is floating-point division in python3. Operator // is integer division. We have no need for floating-points here, so stick with integer division.

General debugging advice

Next time you run into a similar issue, I recommend you try to test the code by yourself. I know of three ways to do that: by hand, or with print, or with a debugger.

By hand

Grab a pen and paper. On your paper, write down a small array A, with maybe 6 elements. Write down l = 0, r = len(A) - 1 = 5. Then read your code and execute it in your head as if you were a computer, making notes on your paper. When you read m = int(l (r-1) / 2), write the resulting m = 154 on your paper. When you arrive at a recursive call generateWorstCase(left, l, m), draw a horizontal line, and start again with the recursive call: A = [...], l = 0, r = ... Since the array A is small enough, you should either be able to run the whole algorithm by hand, ending with a sorted array, or to notice when something becomes wrong (such as m being 154 instead of 104 or 105).

With print

Add calls to print in your code, to print the successive values taken by the variables during execution, and figure out when something goes wrong. Add a few prints at first, and if that's not enough to figure out the problem, add more prints. More and more prints until you can figure out when the issue arises.

For instance:

def generateWorstCase(A, l, r, depth=0):
    print('  '*depth, 'generateWorstCase', 'A=', A, '; l=', l, '; r=', r)
    if l < r:
        m = int(l   (r-1) / 2)
        print('  '*depth, '                 ', 'm=', m)
        left = [0 for i in range(m - l   1)]
        right = [0 for i in range(r - m)]
        split(A, left, right, l, m, r)
        generateWorstCase(left, l, m, depth 1)
        generateWorstCase(right, m 1, r, depth 1)
        join(A, left, right, l, m, r)

With a debugger

There exist programs called "debuggers" which automate this whole process: they execute the code very slowly, allow you to pause during execution, display the values of every variable during execution, and lots of other cool stuff to help you get a better look at what's going on and find your errors.

Fixing your function join

Your function join is not correct. It just concatenates the two arrays left and right without doing any hard work. I'd like to point out something important about mergesort and quicksort. If we summarize those two algorithms, they're pretty similar:

Sort(a):
    split a in two halves
    recursively sort first half
    recursively sort second half
    merge the two halves

So what's the difference between mergesort and quicksort? The difference is where the actual work happens:

  • In quicksort, elements are compared when splitting, so that all the elements in the first half are smaller than all the elements in the second half; then the two halves can simply be concatenated.
  • In mergesort, the array can be split randomly, as long as about half the elements are in each half; elements are compared when merging, so that merging two sorted halves results in one sorted array.

In shorter words:

  • In quicksort, split does the work, and join is trivial;
  • In mergesort, split is trivial, and merge does the work.

Now, in your code, the join function simply concatenates the two halves. That's wrong. Elements should be compared. In fact, if we look at your whole code, there is never any comparison of any elements. So, no chance that the list will be correctly sorted. Toying around with indices doesn't do anything to sort the list. At some point, you have to compare elements, with something like if a[i] < a[j]: or if left[i] < right[j]:; otherwise, how is your algorithm going to find which elements are big and which elements are small, in order to sort the array?

Final code

Python has lots of facilities to deal with lists, such as slices, list comprehensions, or looping over the elements of a list without actually referring to indices. Using these, splitting a list into two halves becomes a lot much easier. It's especially easy because for the mergesort algorithm, it doesn't matter which elements end up in which half, so you have a lot of freedom.

Here is an example modification on your code:

def split(a):
    m = len(a) // 2 
    left = a[:m]
    right = a[m:]
    return left, right

def merge(a, left, right):
    li = 0
    ri = 0
    i = 0
    while li < len(left) and ri < len(right):
        if left[li] < right[ri]:
            a[i] = left[li]
            li  = 1
        else:
            a[i] = right[ri]
            ri  = 1
        i  = 1
    while li < len(left):
        a[i] = left[li]
        li  = 1
        i  = 1
    while ri < len(right):
        a[i] = right[ri]
        ri  = 1
        i  = 1

def mergesort(a):
    if len(a) > 1:
        left, right = split(a)
        mergesort(left)
        mergesort(right)
        merge(a, left, right)

Testing:

a = [12, 3, 7, 8, 5, 4, 9, 1, 0]
print(a)
# [12, 3, 7, 8, 5, 4, 9, 1, 0]
mergesort(a)
print(a)
# [0, 1, 3, 4, 5, 7, 8, 9, 12]

As I mentioned, for the purpose of mergesort, you can split the array any way you want, it doesn't matter. It's only the merging that needs to be done carefully. So, here are two alternatives for the split function:

def split(a):
    m = len(a) // 2 
    left = a[:m]
    right = a[m:]
    return left, right

def split(a):
    even = a[::2]
    odd = a[1::2]
    return even, odd

I strongly encourage you to figure out the difference between these two versions of split, by adding print in the code to figure out how the elements are moved around.

CodePudding user response:

The main problem is a typo here: m = int(l (r-1) / 2).

Do not use l as an identifier because it looks confusingly similar to 1 in many fonts. The correct formula to compute the middle index is:

    m = l   (r-l) // 2

Note that using the integer division // instead of / avoids the need for the conversion int().

There is another bug in the join function: for x in range(m-l) will forget the last item in the slice because m is included rather than excluded. The ubiquitous convention in mergesort implementations to include the slice boundaries is confusing, causing off by one errors such as this one. Consider using r as the index of the first element after the slice.

There are more problems in the code, namely a confusion between temporary arrays and slices of the array A. It is simpler to reason with temporary arrays only.

Here is a simplified version:

def generateWorstCase(A):
    n = len(A)
    if n > 1:
        m = n // 2
        left = [A[i*2] for i in range(n-m)]
        right = [A[i*2 1] for i in range(m)]
        A = generateWorstCase(left)   generateWorstCase(right)
    return A

arr = [1, 2, 3, 4, 5, 6, 7, 8]
print(generateWorstCase(arr))

Output: [1, 5, 3, 7, 2, 6, 4, 8]

This code can be further simplified by taking advantage of Python's superior expressivity:

def generateWorstCase(A):
    return A if len(A) <= 1 else generateWorstCase(A[::2])   generateWorstCase(A[1::2])

arr = [1, 2, 3, 4, 5, 6, 7, 8]
print(generateWorstCase(arr))
  • Related