Home > database >  Quick sort not working as intended with linked list
Quick sort not working as intended with linked list

Time:09-16

Preface:

I do not care about the viability of my particular quick sort algorithm. That is not what the question is about. I am curious as to why the program I have already written behaves the way it does, not about why I should use the STL or whatever. This is for learning purposes. (for instance, I'm aware picking pivot as the head is not the best case -- that's not what I'm asking though)

Question:

I have a linked list composed of Node*s. I've written a quick sort algorithm for it specifically. It works by taking a given linked list, and recursively splitting it into sublists based on a chosen pivot. The pivot is always the head (front) of the list, and the sublists left and right are lists that contain values less than and greater than/equal to the pivot, respectively. I've almost gotten it down, but I don't know how to properly append the pivots. In a previous iteration, I would just append the pivot to the right sublist always, but that caused an infinite loop, so I ended up skipping the pivots entirely when comparing values against it. The resulting list is sorted, but it is missing every value that was used for a pivot. How should I fix this?

Here is a minimal reproducible example:

#include <iostream>
struct Node {
    Node(int d=int(), Node* n=nullptr) {
        data = d;
        next = n;
    }
    int         data;
    Node* next;
};

struct List {
  Node* head;
  size_t      size;

 List(Node* h=new Node, size_t s=0) {
    size = s;
    head = h;
  }

  ~List() {
    Node* current = head;
    while (current != 0) {
      Node* next = current->next;
      delete current;
      current = next;
    }
    head = 0;
  }
};

// Prototypes

Node* sort_q(Node* head);
void  partition(Node* head, Node* pivot, Node*& left, Node*& right);
Node* concatenate(Node* left, Node* right);

// helpful function calls quick sort
void sort(List& l) {
    l.head = sort_q(l.head);
}

// appends node to end of head, and changes end value of that head's list
// returns false if there head = null; true if append was successful
bool append(Node* head, Node* append, Node*& end) {
    if (head == nullptr) {
        return false;
    }
    Node* curr = head;
    for (; curr->next; curr = curr->next) {}
    curr->next = append;
    end = curr->next;
    return true;
}

Node* sort_q(Node* head) {

    Node* left = nullptr;
    Node* right = nullptr;
    Node* pivot = head;

  // base case list = null or head->null
    if (head == nullptr || head->next == nullptr) {
        return head;
    }

    partition(head, pivot, left, right);

    left = sort_q(left);

    right = sort_q(right);

    return concatenate(left, right);
}

void partition(Node* head, Node* pivot, Node*& left, Node*& right) {
    left = new Node, right = new Node;
  // lend is pointer to end of left sublist
  // rend is pointer to end of right sublist
    Node* lend = nullptr, * rend = nullptr;

  // loop through list until end
    while (head != nullptr) {
        if (head == pivot) {
            head = head->next; continue;
        }
        if (head->data < pivot->data) {
            // break current link
            Node* tmp = head;
            head = head->next;
            tmp->next = nullptr;
      // if sublist is not empty, append current node to sublist and update end pointer
            if (append(lend, tmp, lend))
                continue;
        // otherwise, "append" current node to empty sublist and update end pointer manually
            left->next = tmp;
            lend = left->next;
        }
        else {
            // break current link
            Node* tmp = head;
            head = head->next;
            tmp->next = nullptr;
            // if sublist is not empty, append current node to sublist and update end pointer
            if (append(rend, tmp, rend))
                continue;
            // otherwise, "append" current node to empty sublist and update end pointer manually
            right->next = tmp;
            rend = right->next;
        }
    }
    // remove dummy node(s)
    if (left->next)
        left = left->next;
    else
        left = pivot;
    if (right->next)
        right = right->next;
    else
        right = pivot;
    // unlink pivot
    pivot->next = nullptr;
}

// connected sublists
Node* concatenate(Node* left, Node* right) {
    Node* head;

    head = left;

    while (left->next != nullptr) {
        left = left->next;
    }

    left->next = right;

    return head;
}

void print(List &l) {
  for (Node* n = l.head; n != NULL; n = n->next) {
    std::cout << n->data << '\n';
  }
}

int main() {
  // set up linked list 8->4->5->11->7->5->3->9->null
  Node* head = new Node(8 , new Node(4, new Node(5, new Node(11, new Node(7, new Node(5, new Node(3, new Node(9))))))));
  List l(head,5);

  sort(l);
  print(l);

  return 0;
}

The output I want to have is

3
4 // pivot missing from output
5
5
7
8 // pivot missing from output
9
11

but it outputs

3
5
5
7
9
11

Edit #1:

I have tried adding the pivot between concatenations, but that does not work either. It produces different results, but not quite the same. Modifying sort_q() and partition() like so:

Node* qsort(Node* head, bool numeric) {
   // ... other stuff before modification
   return concatenate(left, pivot); // changed right argument to pass pivot which is now pivot->next = right once finishing partition()
}

void partition(Node* head, Node*& pivot, Node*& left, Node*& right, bool numeric) {
    // .. other stuff before modifications

    // remove dummy node
    if (left->next)
        left = left->next;

    if (right->next)
        right = right->next;

    pivot->next = right; // pivot points to right (greater) sublist
}

The output becomes

3
4
5
7
0
8
11
0

CodePudding user response:

You have assumptions that you can get both a left and a right list, and you already make sure you have at least 2 items in the list before you call partition, so that is good.

You just need to handle all the cases at the end of partition. Make sure pivot ends up in one of the two lists. Almost like you had before, but you need to handle the case where both left and right lists are non-empty.

// remove dummy node(s)
    pivot->next = nullPtr;
    if (left->next && right->next) {
        // we have two lists. put pivot on right
        left = left->next;
        pivot->next = right->next;
        right = pivot;
    }
    else if (left->next) {
        // only a left list, use pivot on right
        left = left->next;
        right = pivot;
    }
    else if (right->next) {
        // only a right list, use pivot as left.
        right = right->next;
        left = pivot;
    }
    

seeing the above, handling all 3 cases, it can be simplified to this:

// remove dummy node(s)
    if (left->next) {
        // we have left node, so put pivot on right
        left = left->next;
        pivot->next = right->next;
        right = pivot;
    }
    else if (right->next) {
        // only a right list, use pivot as left.
        right = right->next;
        left = pivot;
        pivot->next = nullPtr;
    }
    

then use concatenate(left,right) afterwards. Although it could be optimized to have partition concatenate the lists and just return the head, but that's a bit more of a rewrite.

EDIT: putting it all together to eliminate your memory leak, it should be like this (note, I have not compiled or tested)

void partition(Node* head, Node* pivot, Node*& left, Node*& right) {
    left = nullptr, right = nullptr;
  // lend is pointer to end of left sublist
  // rend is pointer to end of right sublist
    Node* lend = nullptr, * rend = nullptr;

  // loop through list until end
    while (head != nullptr) {
        if (head == pivot) {
            head = head->next; continue;
        }
        if (head->data < pivot->data) {
            // break current link
            Node* tmp = head;
            head = head->next;
            tmp->next = nullptr;
      // if sublist is not empty, append current node to sublist and update end pointer
            if (append(lend, tmp, lend))
                continue;
        // otherwise, "append" current node to empty sublist and update end pointer manually
            lend = left = tmp;
        }
        else {
            // break current link
            Node* tmp = head;
            head = head->next;
            tmp->next = nullptr;
            // if sublist is not empty, append current node to sublist and update end pointer
            if (append(rend, tmp, rend))
                continue;
            // otherwise, "append" current node to empty sublist and update end pointer manually
            rend = right = tmp;
        }
    }
    // insert pivot node
    if (left) {
        // we have left node, so put pivot on right
        pivot->next = right;
        right = pivot;
    }
    else if (right) {
        // only a right list, use pivot as left.
        left = pivot;
        pivot->next = nullptr;
    }
}

CodePudding user response:

The main problem I see is that you sometimes, but not always, add the pivot to the "left" or "right" list in partition(). This inconsistency likely violates the desired functionality of partition(). However, this functionality is not documented. So I'll revise my assessment to say the main problem is that you have not documented your functions. The bigger your project, the more likely you are to encounter problems because of the lack of documentation.

Document!

I would expect to see partition() documented to say that it partitions the list into three pieces: left, right, and pivot.

/// Partitions the list starting at `head` into three pieces. Data less than
/// the given partition's data is put in the `left` list, while the remaining
/// data (except `*partition` itself) is put in the `right` list. The third
/// piece consists of `*partition`, which is assumed to be in the given list.
void partition(Node* head, Node* pivot, Node*& left, Node*& right) {

Another option might be to immediately put the pivot in one of these lists. However, these lists are going to be sorted immediately after the call to partition(), and there is no need to include *partition in that sort. Sort left, sort right, then concatenate left, partition, and right. In addition to being faster because there are fewer nodes to sort in the next recursive call, keeping the partition node out of your lists will guarantee that your lists get shorter with each recursive call.

It's your call, though, since you are the designer. The important thing is to decide what functionality you want, document it, and follow that decision.

Enforce consistency!

Once you have specified functionality, it is easier to verify that your code respects whatever design decisions you have made. Your sort_q() function fails to account for the above specification in that it does not add the partition to the list. There is probably a simpler way to do this than the below, but the below serves the purpose with little re-writing of your code. (This is the end of sort_q().)

    // ** Return left -> pivot -> right, not just left -> right **
    return concatenate(left, concatenate(pivot, right));
}

In addition, your partition() function violates this specification by (sometimes) adding the partition node to one of your lists. Stop doing that. Set your links to null instead.

    // remove dummy node(s)
    if (left->next)
        left = left->next;
    else
        left = nullptr;   // <-- null, not pivot
    if (right->next)
        right = right->next;
    else
        right = nullptr;   // <-- null, not pivot

Hmm... I had avoided re-writing your code earlier, but this particular structure strikes me as needlessly complicated. If a certain pointer is not null, assign it to something. Otherwise (when that pointer is null), assign null instead of the pointer (which is null, so same thing). That is rather wordy. Just assign the pointer.

    // remove dummy node(s)
    left = left->next;
    right = right->next;
    // *** Memory leak, as we removed the nodes without deleting them ***

Oh that memory leak is a pain. Especially since you don't really need the dummy nodes. But that digresses into a separate question, so for now I'll just note that your code is very close to working if you had initialized left and right to nullptr. (Your hint is that if left is initialized to nullptr then, at the moment append() is called, lend is null if and only if left is null.)

One final piece: if you've followed all this so far, you probably ran into a segmentation fault at some point. You are missing a check in concatenate() (maybe the reason you started using dummy nodes?)

// connected sublists
Node* concatenate(Node* left, Node* right) {
    if ( left == nullptr )   // <-- Add this check
        return right;        // <-- It is very easy to concatenate to nothing.
    Node* head = left;

That should address the immediate question. There are other places to improve, though, so keep working at it. (There are other parts of your code that I would call "needlessly complicated", albeit less severely – and less obviously – than what that if statement became.)

  • Related