Home > OS >  Node deletion is BST, python implementation
Node deletion is BST, python implementation

Time:05-12

I'm curious about this implementation of a node deletion in BST (for additional code/context, see full implementation.) Here's how I understand it:

  1. if val < self.data and elif val > self.data are both cases where the current node isn't the node to be deleted, so they recursively call the delete function on the appropriate child node.

  2. else is the case where we have found the right node and need to perform the deletion.

    a. if self.left is None and self.right is None: return None I'm unclear on what the goal is here. We've returned None but haven't reassigned the node value, itself, to None.

    b. At this point, we've ruled out the possibility that both left and right don't exist and so elif self.left is None is a verbose way to write elif self.right, which returns right. But why? It doesn't reassign the value of the node, itself.

    c. I'm unsure why this last control flow statement is elif self.right is None. Why the absence of an else statement?

  3. This min_val, self.data, self.right dance occurs only when one of the above control flow statements from #2 is not conditionally executed, so I suppose that this is an implicit else statement.

    a. This step really boils down to assigning self.data the minimum value down its right child then assigning self.right the output of a recursive function call, which might be left, right or None from #2.

    def delete(self, val):
        if val < self.data:
            if self.left:
                self.left = self.left.delete(val)
        elif val > self.data:
            if self.right:
                self.right = self.right.delete(val)
        else:
            if self.left is None and self.right is None:
                return None
            elif self.left is None:
                return self.right
            elif self.right is None:
                return self.left

            min_val = self.right.find_min()
            self.data = min_val
            self.right = self.right.delete(min_val)

        return self

To answer this question, please confirm or correct some of my doubts above.

CodePudding user response:

The first cases of the algorithm are about deleting nodes from the tree: removing all references to them while maintaining the BST's key order.

Think about how you'd do this with pencil and paper.

  • If the deleted node has no children, redraw the parent's pointer to point to nothing (None). The deleted node is now "cut out" of the tree. BST order is maintained.

  • If the deleted node has exactly one child, replace the parent's pointer so it now points to that child. Again the deleted node is cut out of the tree, the BST key order is maintained.

Note the replacement of the parent pointers is happening at the recursive calls to delete. E.g. by returning None, the "no child" case is causing the parent to point to nothing. The "cut out" nodes will ultimately be garbage collected by Python.

  • Otherwise you have the more complex case: only one parent of the deleted node and two children. What to do? This particular code finds the next largest key wrt the deleted node and uses it to replace the key to be deleted. The node isn't cut out at all. Then it deletes that moved value from the right subtree, which does cause its node to be cut out. Again BST key order is maintained.

This way of implementing the third case makes clean code, and it probably makes sense in Python where all data access is via reference (pointer). But it's not the only choice. The alternative is to continue the same pattern as the other cases and actually cut out the node containing the deleted value and move the next largest node to that position (not just it's key data). This is a better choice if the node actually contains the data (not just a reference to it), and it's very large. It saves a copy of all that data. If the author had made that choice, there would be another return of that node for re-assignment of the parent's pointer.

  • Related