Home > Mobile >  fastest non-destructive method to get sorted version of a heap?
fastest non-destructive method to get sorted version of a heap?

Time:12-28

I have a priority heap holding an event queue.

I need to dump this out for the user in order, but without rendering the heap unusable.

Obviously if I was willing to destroy it I could simply dequeue events until it was empty, and add them in order to my sorted list. But then of course the heap is gone. Further, Quicksort is so much faster than a heap sort that I don't have confidence that I can build this sorted list faster than I can make a copy of the heap and sort the copy.

One idea I had was to in fact destroy the heap by dequeueing all its items, but then... replacing the now-empty priority queue with the resulting sorted list, which should maintain the heap property (of cell i being a higher priority than cell i * 2 1 and i * 2 2). So I'm also wondering whether such a heap would perform better than a regular heap.

The easiest solution is just to copy the heap array, and sort the copy. But some sorts do a bad job when given sorted or somewhat-sorted data, and I'm wondering whether the Standard C library's sort (or C qsort()) could be trusted to handle sorting a heap as efficiently?

CodePudding user response:

With the usual heap implementation, just sort the heap in place. A sorted list satisfies the heap condition for a min-heap. Alternately if you sort the heap descending, you satisfy the heap condition for a max-heap. And you can always sort it one way and traverse another if that is what you need.

Note that the sort::heap documentation warns about breaking the heap condition. Be careful that you know you haven't if you are changing the heap data in place.

CodePudding user response:

It looks to me like you're concerned about a performance problem that isn't really a problem. As I understand it, modern C implementations of sort use Introsort, which avoids the pathological worst-case times of a naïve Quicksort. And the difference between Quicksort and Heapsort, in the context of generating user output, is not large enough to be a concern.

Just copy the heap and sort it. Or call sort::heap and output the result, provided of course that doing so doesn't break the heap.

You asked if a sorted heap performs better than a non-sorted heap. No. When adding an item, you still add it as the last node and sift it up. Half of the nodes in a heap are at the leaf level and assuming a uniform distribution of new items, then half of the items you add will end up at the leaf level, requiring no swaps. Worst case is if every item you add ends up being the smallest (in a min-heap), in which case every time you add an item it will take log(n) swaps to move it to the root. Now, if every item added is larger than any other item in the heap, then of course addition is O(1). But that's true regardless of whether the heap was initially created from a sorted array.

Deleting an item from the heap requires that you replace the root item with an item from the leaf level and then sift it down. In a sorted heap, the likelihood that the replacement item will end up back down at the leaf level is very high, which means that adjusting the heap will require the maximum log(n) swaps. A sorted heap almost guarantees that removal will require the maximum number of swaps. In this case, a sorted heap is potentially worse in terms of performance than a heap constructed from a randomly-arranged array.

But all that changes quickly as you begin adding items to and removing items from the heap. The heap becomes "not sorted" fairly quickly.

Over the life of the priority queue, it's highly unlikely that the initial order of items will make any noticeable difference in the performance of your binary heap.

  • Related