This is something I do not quite understand. When I read literature on heaps, it always says that the big advantage of a heap is that you have the top (max if max heap) element immediately available. But couldn't you just use a BST and store a pointer to the same node (bottom-rightmost) and update the pointer with insertions/deletions?
If I'm not mistaken, with the BST implementation I'm describing you would have
=============================================
            |  Insert     |  Remove Max
=============================================
Special BST |  O(log(n))  |  O(1)
=============================================
Max Heap    |  O(log(n))  |  O(log(n))
=============================================
making it better.
Pseudo-code:
Insert:
Same as a regular BST insert, but we can keep track of whether
the inserted item is the new max, because in that case the
traversal goes entirely to the right.
Delete:
Set parent of max equal to null. Done.
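Roughly, in Python (just a sketch to illustrate the idea; the class and names are made up):

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

class MaxTrackingBST:
    def __init__(self):
        self.root = None
        self.max_node = None     # bottom-rightmost node
        self.max_parent = None   # its parent, so removal can unlink it

    def insert(self, key):
        if self.root is None:
            self.root = self.max_node = Node(key)
            return
        parent, cur, went_right_only = None, self.root, True
        while cur is not None:
            parent = cur
            if key < cur.key:
                went_right_only = False
                cur = cur.left
            else:
                cur = cur.right
        node = Node(key)
        if key < parent.key:
            parent.left = node
        else:
            parent.right = node
        if went_right_only:          # traversal went right the whole way,
            self.max_node = node     # so the new node is the new max
            self.max_parent = parent

    def remove_max(self):            # assumes a non-empty tree
        key = self.max_node.key
        if self.max_parent is None:                     # max is the root
            self.root = self.max_node.left
        else:
            self.max_parent.right = self.max_node.left  # unlink the max
        # (updating max_node/max_parent after a removal is not shown here)
        return key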
What am I missing here?
CodePudding user response:
Both max heaps and balanced BSTs (e.g. AVL trees) perform these operations in O(log n) time. However, a BST takes a constant factor more space because of the child pointers, and its code is more complicated.
CodePudding user response:
Since you're talking about BSTs and not balanced BSTs, consider the following skewed BST:
1
\
2
\
3
\
...
\
n
You can hold a pointer reference to the max (the n-th) element, but if you then insert a value < n, the insertion takes O(n) time in the worst case. Also, seeing the max value of a heap is already O(1): if the heap is implemented as an array, it is simply heap[0].
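For example, with Python's heapq (a min-heap, so values are negated here to simulate a max-heap), peeking at the max really is just an index:

import heapq

heap = []
for value in [5, 1, 9, 3, 7]:
    heapq.heappush(heap, -value)   # heapq is a min-heap, so store negated values

print(-heap[0])                    # O(1) peek at the maximum: prints 9
print(-heapq.heappop(heap))        # O(log n) removal of the maximum: prints 9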
CodePudding user response:
But couldn't you just use a BST and store a pointer to the same node (bottom-rightmost) and update the pointer with insertions/deletions?
Yes, you could.
with the BST implementation I'm describing you would have [...] Remove Max O(1) [...] making it better. [...] Set parent of max equal to null. Done.
No, Max removal wouldn't (always) be O(1), for the following reasons:
After you have removed the Max, you need to also update the pointer to reference the bottom right-most node. For example, take this tree, before the Max is removed:
            8
          /   \
         5     20   <-- Max pointer
        /     /
       2     12
            /  \
          10    13
                  \
                   14
You'll have to find the node with value 14 in order to update the Max pointer.
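To make that concrete, here is a minimal sketch (plain Python with a made-up Node class, not any particular library) of what updating the pointer involves: a walk down the right spine of whatever remains, which costs O(height) rather than O(1).

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def rightmost(node):
    # Walk down the right spine to find the max: O(height), not O(1).
    while node.right is not None:
        node = node.right
    return node

# The tree above after node 20 has been unlinked (12 takes its place):
root = Node(8,
            Node(5, Node(2)),
            Node(12, Node(10), Node(13, right=Node(14))))
print(rightmost(root).key)   # prints 14, the new Max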
That pointer update can be made O(1) by keeping the tree balanced, say according to the AVL rules. In that case the previous Max node's left child (if any) cannot have a right child of its own, so the new Max is either that left child or, if there is none, the previous Max's parent. However, some deletions will leave the tree unbalanced and must be followed by a rebalancing operation, which may involve several rotations. For instance, take this balanced BST:
              8
           /     \
          5       13
         / \     /  \
        2   6   9    15   <-- Max pointer
       / \   \   \
      1   4   7   10
         /
        3
After removal of node 15, it is easy to determine that 13 is the next Max, but the subtree rooted at 13 is no longer balanced. After rebalancing it, the tree as a whole is unbalanced, and another rotation is needed. In general the number of rotations can be O(log n).
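As an illustration only (not a full AVL implementation), here is a sketch of the single rotations that AVL rebalancing is built from, applied to the 13/9/10 subtree from the example:

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def rotate_left(x):
    # x's right child becomes the root of this subtree.
    y = x.right
    x.right = y.left
    y.left = x
    return y

def rotate_right(y):
    # y's left child becomes the root of this subtree.
    x = y.left
    y.left = x.right
    x.right = y
    return x

# The subtree rooted at 13 after deleting 15: left-heavy, with a right-heavy
# left child, so it needs a double (left-right) rotation. Fixing it changes
# the subtree's height, which can unbalance ancestors in turn.
subtree = Node(13, Node(9, right=Node(10)))
subtree.left = rotate_left(subtree.left)
subtree = rotate_right(subtree)
print(subtree.key, subtree.left.key, subtree.right.key)   # prints: 10 9 13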
Concluding, you can use a balanced BST with a Max pointer, but extracting the Max node is still an O(log n) operation, giving it the same time complexity as the corresponding operation on a binary heap.
Considering that a binary heap uses no pointers, and thus has much less "administrative" overhead than a self-balancing BST, the actual space consumption and runtime of the insert/delete operations will be better by a constant factor, even though their asymptotic complexity is the same.
Also, a binary heap can be built from an unsorted array in O(n) time, while building a BST from one costs O(n log n).
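In Python, for instance, heapq.heapify does that linear-time build in place:

import heapq

data = [9, 4, 7, 1, 8, 3]
heapq.heapify(data)   # O(n) bottom-up heap construction, in place
print(data[0])        # prints 1, the min; store negated values for a max-heap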
However, a BST is the way to go when you need to be able to traverse the values in their proper order, or find a value, or find a value's predecessor/successor. A binary heap has worse time complexities for such operations.
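For instance, an in-order walk of a BST yields the keys in sorted order, which a heap cannot do without repeatedly extracting the top; a quick sketch (made-up Node class):

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def in_order(node):
    # Yields the keys in ascending order: cheap in a BST, not in a heap.
    if node is not None:
        yield from in_order(node.left)
        yield node.key
        yield from in_order(node.right)

root = Node(8, Node(5, Node(2), Node(6)), Node(13, Node(9), Node(15)))
print(list(in_order(root)))   # [2, 5, 6, 8, 9, 13, 15]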