Linked List v.s. Binary Search Tree Insertion Time Complexity-CodePudding

Linked List

A linked list’s insertion time complexity is O(1) for the actual operation, but requires O(n) time to traverse to the proper position. Most online resources list a linked list’s average insertion time as O(1):

https://stackoverflow.com/a/17410009/10426919
https://www.bigocheatsheet.com/
https://www.geeksforgeeks.org/time-complexities-of-different-data-structures/

BST

A binary search tree’s insertion requires the traversal of nodes, taking O(log n) time.

Problem

Am I mistaken to believe that insertion in a BST also takes O(1) time for the actual operation?

Similar to the nodes of a linked list, an insertion of a node in a BST will simply point the current node’s pointer to the inserted-node, and the inserted-node will point to the current node’s child node.

If my thinking is correct, why do most online resources list the average insert time for a BST to be O(log n), as opposed to O(1) like for a linked list?

It seems that for linked list, the actual insertion time is listed as the insertion time complexity, but for BST, the traversal time is listed as the insertion time complexity.

CodePudding user response：

A binary search tree is ordered, and it's typically balanced (to avoid O(n) worst-case search times), which means that when you insert a value some amount of shuffling has to be done to balance out the tree. That rebalancing takes an average of O(log n) operations, whereas a Linked List only needs to update a fixed number of pointers once you've found your place to insert an item between nodes.

CodePudding user response：

To insert into a linked list, you just need to maintain the end node of the list (assuming you are inserting at the end).

To insert into a binary search tree (BST), and to maintain the BST after insertion, there is no way you can do that in O(1) - since the tree might re-balance. This operation is not as simple as inserting into a linked list.

Check out some of the examples here.

CodePudding user response：

The insertion time of a Linked List is actually depends on where you are inserting and the types of Linked List.

For example consider the following cases:

You are using a single linked list and you are going to insert at the end / middle, you would have running time of O(n) to traverse the list till the end node or middle node.
You are using double linked list (with two pointer first pointer points to head element and second pointer points to last element) and you are going to insert at the end, this time still you would have O(n) time complexity since you need to traverse to the middle of the list using either first or second pointer.
You are using single linked list and you are going to insert at the first position of the list, this time you would have complexity of O(1) since you don't need to traverse any node at all. The same is true for double linked list and insert position at the end of the list.

So you can see in worst cases scenario a Linked list would take O(n) instead of O(1).

Now in case of BST you can come up with O(log n) time if your BST is balanced and not skewed. If your TREE is skewed (where every elements are greater than the prev elements) this time you need to traverse all the nodes to find the insertion position. For example consider your tree is 1->2->4->6 and you are going to insert node 9, so you need to visit all the nodes to find to insertion position.

1
 \
  2
   \
    4
     \
      6 (last position after which new node going to insert)
       \
        9 (new insertion position for the new node)

Therefore you can see you need to visit all the nodes to find the proper place, if you have n-nodes you would have O(n 1) => O(n) running time complexity.

But if your BST is balanced and not skewed the situation changes dramatically, since every move you can eliminate nodes which is not comes under condition.

PS: What I mean by not comes under the condition you can take this as homework!

CodePudding user response：

It reflects the usage. It's O(1) and O(log n) for the operations you'll actually request from them.

With a BST, you'll likely let it manage itself while you stay out of the implementation details. That is, you'll issue commands like tree.insert(value) or queries like tree.contains(value). And those things take O(log n).

With a linked list, you'll more likely manage it yourself, at least the positioning. You wouldn't issue commands like list.insert(value, index), unless the index is very small or you don't care about performance. You're more likely to issue commands like insertAfter(node, newNode) or insertBeginning(list, newNode), which do only take O(1) time. Note that I took these two from Wikipedia's Linked list operations > Singly linked lists section, which doesn't even have an operation for inserting at a certain position given as an index. Because in reality, you'll manage the "position" (in the form of a node) with the algorithm that uses the linked list, and the time to manage the position is attributed to that algorithm instead. That can btw also be O(1), examples are:

You're building a linked list from an array. You'll do this by keeping a variable referencing the last node. To append the next value/node, insert it after that last node (an O(1) operation indeed), and update your variable to reference the new last node instead (also O(1)).
Imagine you don't find a position with a linear scan but with a hash map, storing references directly to linked list nodes. Then looking up the reference takes O(1) and inserting after the looked-up node also again only takes O(1) time.

If you had shown us some of those "Most online resources list a linked list’s average insertion time as O(1)", we'd likely see that they're indeed showing insertion operations like insertAfterNode, not insertAtIndex. Edit now that you included some links in the question: My thoughts on those sources regarding the O(1) insertion for linked lists: The first one does point out that it's O(1) only if you already have something like an "iterator to the location". The second one in turn refers to the same Wikipedia section I showed above, i.e., with insertions after a given node or at the beginning of a list. The third one is, well, the worst site about programming I know, so I'm not surprised they just say O(1) without any further information.

Put differently, as I like real-world analogies: If you ask me how much it costs to replace part X inside a car motor, I might say $200, even though the part only costs $5. Because I wouldn't do that myself. I'd let a mechanic do that, and I'd have to pay for their work. But if you ask me how much it costs to replace the bell on a bicycle, I might say $5 when the bell costs $5. Because I'd do the replacing myself.