I am working on an implementation of recursive quicksort in python. I am working with very large data sets (10,000 - 1,000,000 elements). When feeding it ordered data (i.e. changing an array sorted from largest -> smallest to smallest -> largest) it works fine. But when giving it unordered data, it doesn't seem to work at all.
I'm using a 3-way pivot that counts elements that are equal to the pivot (dutch national flag problem) and, if the data is larger than 101 (remember it's recursive), it uses a median of first, middle, and last elements in the array partition when selecting the pivot.
Slice of output from a random set of 10,000 elements:
129
1
16
7
21
4
16
19
4
15
22
23
1
20
25
27
1
4
15
31
31
37
41
32
42
46
46
44
As you can see, it seems to have a certain pattern to it. I've increased the recursion limit so I know that's not the problem, but it almost seems like it's only partially sorting the input. Like I mentioned, it works fine for an array that's already sorted in reverse order, which is what mystifies me.
My code:
def array_swap(A, i, j):
temp = A[i]
A[i] = A[j]
A[j] = temp
# take median of three elements
def median_three(a, b, c):
return sorted([a, b, c])[1]
def qs_part(A, lo, hi):
mid = lo
p = median_three(A[lo], A[(lo hi) // 2], A[hi]) # pivot set to median of 3, as described in handout
# ('//' must be used for floor division in Python 3 -- it took me longer than I'm willing to admit to figure that out.)
while mid <= lo:
if A[mid] < p:
array_swap(A, mid, lo)
mid = 1
lo = 1
elif A[mid] > p:
array_swap(A, mid, hi)
hi -= 1
else:
mid = 1
return (lo - 1), mid
# Quicksort implementation as described in handout.
# A: Array to sort
# write: name of file to write sorted results to
# out: name of file to write runtime to
# lo: start index (1 after pivot in first case)
# hi: end index
def quick_sort(A,lo, hi):
if lo >= hi:
return # if there are 0 or 1 elements in A
if lo - hi == 1:
if A[lo] < A[hi]: # if 2 elements, sort them
array_swap(A, lo, hi)
return
i, j = qs_part(A, lo, hi)
quick_sort(A, lo, i) # recursively sort partition 1 (elements < pivot)
quick_sort(A, j, hi) # recursively sort partition 2 (elements > pivot)
# no need to sort between 'i' and 'j' because those items == pivot
CodePudding user response:
I figured it out. It was just a small typo in qs_part(). while mid <= lo
should have been while mid <= hi