Find corresponding node in an identical DOM tree - what is the average and worse time complexity?-CodePudding

Here is the question posted on here years ago about an interview question where given a node from a DOM tree, we need to find the node in the same position from an identical DOM tree.

Here is the an iterative solution I came up with

const findCorrespondingNode = (rootA, rootB, nodeA) => {
  let currentNode = nodeA;
  const path = [];
  while (currentNode !== rootA) {
    const index = Array.prototype.indexOf.call(
      currentNode.parentElement.children,
      currentNode
    );
    path.push(index);
    currentNode = currentNode.parentElement;
  }
  currentNode = rootB;
  while (path.length) {
    currentNode = currentNode.children[path.pop()];
  }

  return currentNode;
}

Basically I just walk backwards from the target to root, creating the reverse path. The reverse path is an array where each element is the child index of a node. Travel down the path from rootB to target, popping off the next index until the path array is empty. Finally we return the pointer to the target node.

My question is:

what is the (average) time complexity for this solution? The worse case is easy since we need to visit each node once so it is going to be O(n), n being the number of DOM nodes in the DOM tree. I think it happens when we have two linked lists and the target node is the last node in each linked list. The confusion part for me is that for each level we are also calling Array.prototype.indexOf to get the index, and this might take up to O(D), D being the tree's diameter and for a leaf node it is going to take O((some-constant)n) to get the index. , More importantly, what is the average time complexity? We are traversing a tree twice. It seems like it is going to be the height or the depth of the tree. If it is a completely balanced tree, the height of the tree is going to be logN. Does that mean the average time complexity is logN?
If I write the solution using a recursive DFS approach, where we traverse both trees simultaneously, i.e.

const findCorrespondingNode = (rootA, rootB, target) => {
  if(rootA === target) {
    return rootB;
  }
  for (let i = 0; i < rootA.children.length; i  ) {
    const foundNode = findCorrespondingNode(rootA.children[i], rootB.children[i], target);
    if(foundNode) return foundNode;
  }
}

What is the time complexity in this case? Still worse case O(n) and average/best case O(logN). Is this recursive approach better than the iterative approach in terms of time complexity since we wouldn't need to do indexOf for each level?

CodePudding user response：

what is the (average) time complexity for this solution?

It is O(min(d.logn, n)) where d is the (maximum) branching factor of the nodes (2 for binary trees). The indexOf call is responsible for this coefficient. However, even if d is large, the nodes that are scanned during indexOf are never visited again, so the time complexity is also O(n) (as an upper bound). For relatively smaller d (in comparison with n), O(d.logn) better expresses the time complexity. To cover both extremes we can say: O(min(d.logn, n)). This expresses the fact that if d approaches n, then necessarily logn becomes small (the tree's height diminishes).

The worse case is easy since we need to visit each node once so it is going to be O(n), n being the number of DOM nodes in the DOM tree. I think it happens when we have two linked lists and the target node is the last node in each linked list.

True.

The confusion part for me is that for each level we are also calling Array.prototype.indexOf to get the index, and this might take up to O(D), D being the tree's diameter and for a leaf node it is going to take O((some-constant)n) to get the index.

The diameter is not concerned here. The complexity of indexOf depends on the number of siblings. In the case of a degenerate tree that really is a linked list (as you wrote), D is 1, i.e. none of the nodes have siblings, and so indexOf is always called on an array with just one element: indexOf takes constant time in that case.

We are traversing a tree twice.

A factor of 2 is irrelevant for deriving a time complexity.

It seems like it is going to be the height or the depth of the tree. If it is a completely balanced tree, the height of the tree is going to be logN. Does that mean the average time complexity is logN?

Yes. Even "almost" balanced trees, like AVL-trees, red-black trees, ... still give this logarithmic time complexity. If you create a tree randomly, it is expected that it will be rather balanced, and its height is O(logN).

If I write the solution using a recursive DFS approach, where we traverse both trees simultaneously, [...] What is the time complexity in this case?

Here you don't make use of the parent links, and so in the worst case you may have to visit each node. This makes it O(n).

Is this recursive approach better than the iterative approach in terms of time complexity since we wouldn't need to do indexOf for each level?

The indexOf strategy isn't that bad if d is much smaller than n. If however we have no idea at all whether that is the case, then the worst-case time complexity is the same -- O(n).

If d is much smaller than n, then the first algorithm has a better time complexity, O(d.logn).