Home > Mobile >  (python) quicksort working for ordered data, but not for unordered data
(python) quicksort working for ordered data, but not for unordered data

Time:04-14

I am working on an implementation of recursive quicksort in python. I am working with very large data sets (10,000 - 1,000,000 elements). When feeding it ordered data (i.e. changing an array sorted from largest -> smallest to smallest -> largest) it works fine. But when giving it unordered data, it doesn't seem to work at all.

I'm using a 3-way pivot that counts elements that are equal to the pivot (dutch national flag problem) and, if the data is larger than 101 (remember it's recursive), it uses a median of first, middle, and last elements in the array partition when selecting the pivot.

Slice of output from a random set of 10,000 elements:

129
1
16
7
21
4
16
19
4
15
22
23
1
20
25
27
1
4
15
31
31
37
41
32
42
46
46
44

As you can see, it seems to have a certain pattern to it. I've increased the recursion limit so I know that's not the problem, but it almost seems like it's only partially sorting the input. Like I mentioned, it works fine for an array that's already sorted in reverse order, which is what mystifies me.

My code:

def array_swap(A, i, j):
    temp = A[i]
    A[i] = A[j]
    A[j] = temp


# take median of three elements
def median_three(a, b, c):
    return sorted([a, b, c])[1] 
    


def qs_part(A, lo, hi):
    mid = lo
    p = median_three(A[lo], A[(lo   hi) // 2], A[hi]) # pivot set to median of 3, as described in handout 
    # ('//' must be used for floor division in Python 3 -- it took me longer than I'm willing to admit to figure that out.)
    
    while mid <= lo:
        if A[mid] < p:
            array_swap(A, mid, lo)
            mid  = 1
            lo   = 1
        elif A[mid] > p:
            array_swap(A, mid, hi)
            hi  -= 1
        else:
            mid  = 1
    return (lo - 1), mid
    

# Quicksort implementation as described in handout.
# A: Array to sort
# write:    name of file to write sorted results to
# out:      name of file to write runtime to
# lo:       start index (1 after pivot in first case)
# hi:       end index
def quick_sort(A,lo, hi):
    if lo >= hi:
        return # if there are 0 or 1 elements in A
    
    if lo - hi == 1:
        if A[lo] < A[hi]: # if 2 elements, sort them
            array_swap(A, lo, hi)
        return
    
    i, j = qs_part(A, lo, hi)
    quick_sort(A, lo, i) # recursively sort partition 1 (elements < pivot)
    quick_sort(A, j, hi) # recursively sort partition 2 (elements > pivot)
    # no need to sort between 'i' and 'j' because those items == pivot

CodePudding user response:

I figured it out. It was just a small typo in qs_part(). while mid <= lo should have been while mid <= hi

  • Related