Home > Software design >  Quicksort implementation bug
Quicksort implementation bug

Time:12-29

After a while, I was trying to implement quickSort on my own. To me the implementation looks fine, but some test cases are apparently failing.

    class MyImplementation {
    public int[] sortArray(int[] nums) {
        quicksort(nums, 0, nums.length - 1);
        return nums;
    }

    private void quicksort(int[] nums, int lo, int hi){
        if(lo < hi){
            int j = partition(nums, lo, hi);
            quicksort(nums, lo, j-1);
            quicksort(nums, j 1, hi);
        }
    }

    private int partition(int[] nums, int lo, int hi){
        int pivot = nums[lo];
        int head = lo   1;
        int tail = hi;
        while(head < tail){
            while(head < hi && nums[head] < pivot)   head;
            while(nums[tail] > pivot) --tail;
            if(head < tail){
                swap(nums, head, tail);
                  head;
                --tail;
            }
        }
        if(nums[head] < pivot)
          swap(nums, lo, head);
        else
          swap(nums, lo, head - 1);  
        return head;
    } 

    private void swap(int[] nums, int i, int j){
        int temp = nums[i];
        nums[i] = nums[j];
        nums[j] = temp;
    }
   }

It's passing in many cases, like these :-

   [5,2,3,1]
   [5,1,1,2,0,0]
   [7,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5] 

but it's failing in the following test cases :-

[5,71,1,91,10,2,0,0,13,45,7]

the output I am getting is :-

[0,0,1,2,5,10,7,71,13,45,91]

instead of :-

[0,0,1,2,5,7,10,13,45,71,91]

I know, I can copy from somewhere to get the correctly working code, but can we figure out why this code is not working from an Algorithmic point of view!! What is the logical flaw in this code? How can we fix this with minimum changes and get a perfectly working code?

CodePudding user response:

That's quite subtle. The bug is here:

private void quicksort(int[] nums, int lo, int hi){
    if(lo < hi){
        int j = partition(nums, lo, hi);
        quicksort(nums, lo, j-1);
        quicksort(nums, j 1, hi);
        //              ^^^  j 1 must be j
    }
}

The most subtle part is that it is not in general necessary to use j as the starting point of the "high" part to recurse on, but your partitioning algorithm requires it. So you will see various examples that recurse on different partitions, and they can be correct as well, in combination with whatever partitioning algorithm they use, but there is not much freedom to mix.

Your partitioning algorithm is based on Hoare's partitioning algorithm, not Lomuto's algorithm. Lomuto's partitioning algorithm puts the pivot in its final position and returns its index, so that index can be skipped in the recursion. Hoare's partitioning algorithm, or your variant of it, does not quite do that. Eg in the case that the else branch at the end of your partitioning algorithm is taken, the pivot ends up at head - 1, while element at head has some arbitrary value not-less-than the pivot, all we know is that it belongs in the "high" partition but it has not been sorted yet and cannot be skipped.

In combination with Hoare's partitioning, it is common to see the recursion on [lo .. p] (where p is the index returned by the partitioning algorithm) and [p 1 .. hi], but that is not suitable in this case because your version of it returns head instead of tail, effectively making p one higher, so the partitions become [lo .. p-1] and [p .. hi].

CodePudding user response:

The currently accepted answer fixes the issue, but actually, the Lomuto algorithm should be able to use j 1 as start for the second partition, since the pivot is expected to be at j and therefore doesn't need to be part of recursive sorts.

The real problem is that your partition method doesn't always return the index of the pivot:

        if(nums[head] < pivot)
          swap(nums, lo, head);
        else
          swap(nums, lo, head - 1);  
        return head;

In case of the else, the pivot is not swapped to index head, but to head - 1, and therefore the returned index should be head - 1 in that case.

Here is a simple fix:

        if(nums[head] < pivot)
          swap(nums, lo, head);
        else
          swap(nums, lo, --head);  
        return head;
  • Related