Home > OS >  SQL - first common ancestor in hierarachy
SQL - first common ancestor in hierarachy

Time:03-09

I am attempting to use a CTE to find the first common ancestor in a hierarchy for all values in a source table. I am able to return the entire hierarchy, but cannot seem to find a way to get the first common ancestor.

The ultimate goal is to update the source table to use only common ancestors. Source table:

Type_Id    Id    Parent_Id    Group_Id
A          3     2            1
A          4     2            1
A          5     4            1

After the CTE the table I get is:

Type_Id    Id    Child_Id    Parent_Id    Group_Id    Level
A          3     3           2            1           1
A          4     4           2            1           1
A          5     5           4            1           1
A          5     4           2            1           2

What I'm looking to do is update the id of the source so that for each type and group combination, the new id is to be the first common parent. In this case, it would be set all ids to 2 as that's the first common entry.

CodePudding user response:

You can try to use cte recursive self-join by your logic.

;WITH CTE as (
   SELECT Type_Id,Id,id Child_Id,Parent_Id,Group_Id,1 Level 
   FROM T
   UNION ALL
   SELECT c.Type_Id,c.Id,t.id Child_Id,t.Parent_Id,t.Group_Id,c.Level 1
   FROM CTE c 
   INNER JOIN T t
   ON c.Parent_Id = t.ID AND c.Type_Id = t.Type_Id
)
SELECT *
FROM CTE

sqlfiddle

CodePudding user response:

By building a list of all ancestor relationships (including self) and identifying the depth of each node from the root, you can then query all ancestors for the N selected nodes of interest. Any ancestors occurring N times are candidate common ancestors. The one with the deepest depth is the answer.

The following is what I came up with using simplified Node data containing only Id and ParentId. (The initial CTE was adapted from D-Shih's post.)

;WITH Ancestry as (
   -- Recursive list of ancestors (including self) for each node
   SELECT ChildId = N.Id, AncestorLevel = 0, AncestorId = N.Id, AncestorParentId = N.ParentId 
   FROM @Node N
   UNION ALL
   SELECT C.ChildId, AncestorLevel = C.AncestorLevel   1, AncestorId = P.Id, AncestorParentId = P.ParentId 
   FROM Ancestry C
   INNER JOIN @Node P ON P.Id = C.AncestorParentId
),
AncestryDepth AS (
    -- Calculate depth from root (There may be a quicker way)
    SELECT *, Depth = COUNT(*) OVER(PARTITION BY ChildId) - AncestorLevel
    FROM Ancestry
),
Answer AS (
    -- Answer is deepest node that is an ancestor of all selected nodes
    SELECT TOP 1 Answer = A.AncestorId
    FROM @Selected S
    JOIN AncestryDepth A ON A.ChildId = S.Id
    GROUP BY A.AncestorId, A.Depth
    HAVING COUNT(*) = (SELECT COUNT(*) FROM @Selected) -- Ancestor covers all
    ORDER BY A.Depth DESC -- Deepest
)
SELECT *
FROM Answer

There may be room for improvement, possibly a more direct calculation of depth, but this appears to give the correct result.

See this db<>fiddle for a working demo with data and several test scenarios.

  • Related