Home > Software design >  How to write a recursive CTE query to find the parent record?
How to write a recursive CTE query to find the parent record?

Time:07-26

I am having trouble writing a query which would return the parental id. I have a table that looks like this:

CREATE TABLE csx
(
    id INT UNSIGNED PRIMARY KEY, 
    cd VARCHAR(11), 
    category VARCHAR(20),
    lvl INT, 
    parent_id INT
)

INSERT INTO csx 
VALUES
(1,"ab-00-00-00",'ab',1,null),
(2,"ac-00-00-00",'ac',1,null),
(3,"ac-01-00-00",'ac',2,2),
(4,"ac-01-00-01",'ac',3,3),
(5,"ac-01-00-02",'ac',3,3),
(6,"ac-03-00-00",'ac',2,2),
(7,"ac-03-00-01",'ac',3,6),
(8,"ac-03-00-02",'ac'3,6),
(9,"ac-02-00-00",'ac'2,2),
(10,"ac-02-00-01",'ac',3,9)

I want to check whether parent_id (referencing the id of the entry) is correct.

I am new to recursive CTEs (I think those have to be used). Could you please shed some light on the correct way to implement the CTE which would return parental ids?

CodePudding user response:

Instead of trying to verify values after the fact, ensure there can never be invalid ParentIDs with a foreign key :

CREATE TABLE csx
(
    id INT PRIMARY KEY, 
    cd VARCHAR(11), 
    category VARCHAR(20),
    lvl INT, 
    parent_id INT references csx(id)
)

If you wanted to ensure there are no invalid ParentIDs in an existing table, a simple LEFT JOIN would be enough to find all problems:

SELECT t1.*
from csx t1 left join csx t2 on t1.ParentID=t2.ID
where t2.ID is null

This will return all rows with a non-existent ParentID. The FOREIGN KEY on the other hand ensures there won't be any invalid values in the first place.

To calculate levels and paths you can use a recursive CTE. A CTE is more-or-less a subquery that can be referenced by name and used in multiple places. A recursive CTE is a CTE that refers to itsel.

To get all root items and their children, the following CTE first selects all roots, then joins the actual table with itself to retrieve the children :

with cte as (
    select csx.* ,
        1 as Level,
        cast(ID as varchar(200)) as Path
    from csx
    where parent_id is null
    union all
    select csx.* ,
        cte.Level 1 as Level,
        cast(CONCAT_WS('/',cte.Path, csx.ID) as varchar(200)) As Path
    from csx inner join cte on cte.ID=csx.parent_id 
)
select * from cte
order by path
id cd category lvl parent_id Level Path
1 ab-00-00-00 ab 1 NULL 1 1
2 ac-00-00-00 ac 1 NULL 1 2
3 ac-01-00-00 ac 2 2 2 2/3
4 ac-01-00-01 ac 3 3 3 2/3/4
5 ac-01-00-02 ac 3 3 3 2/3/5
6 ac-03-00-00 ac 2 2 2 2/6
7 ac-03-00-01 ac 3 6 3 2/6/7
8 ac-03-00-02 ac 3 6 3 2/6/8
9 ac-02-00-00 ac 2 2 2 2/9
10 ac-02-00-01 ac 3 9 3 2/9/10

The first query selects the roots and sets the root values for the Level (1) and Path (ID). The next query joins the table with the CTE to match roots and children.

The question's cd column isn't a path though. It looks like a row number inside each parent's direct children. Calculating row numbers is the job of the ROW_NUMBER function. Since we're counting inside a parent's children, we can use ROW_NUMBER() OVER(PARTITION BY Parent_ID ORDER BY ID).

with cte as (
    select csx.* ,
        1 as Level,
        cast(category as varchar(200)) as Path
    from csx
    where parent_id is null
    union all
    select csx.* ,
        cte.Level 1 as Level,
        cast(CONCAT_WS('-',cte.Path, ROW_NUMBER() OVER(PARTITION BY csx.Parent_ID ORDER BY csx.ID)) as varchar(200)) As Path
    from csx inner join cte on cte.ID=csx.parent_id 
)
select * from cte
order by path;

This produces

id cd category lvl parent_id Level Path
1 ab-00-00-00 ab 1 NULL 1 ab
2 ac-00-00-00 ac 1 NULL 1 ac
3 ac-01-00-00 ac 2 2 2 ac-1
4 ac-01-00-01 ac 3 3 3 ac-1-1
5 ac-01-00-02 ac 3 3 3 ac-1-2
6 ac-03-00-00 ac 2 2 2 ac-2
7 ac-03-00-01 ac 3 6 3 ac-2-1
8 ac-03-00-02 ac 3 6 3 ac-2-2
9 ac-02-00-00 ac 2 2 2 ac-3
10 ac-02-00-01 ac 3 9 3 ac-3-1

Unfortunately, the values don't match. With ORDER BY ID, the row with ID 6 will have a path ac-2 instead of ac-3. Changing the order will break all other rows. There's no other indicator that could be used to determine the row number of the children, at least not in this table.

This means that either rows 6-10 are all wrong, or thatcd can't be used to determine if Parent_ID is wrong. It doesn't contain identifiers but calculated values. The only way to say if the data match these, is to try and reproduce them. Unfortunately, there's not enough information in the table to do so

CodePudding user response:

Use SQL Cursor to loop through records Refer below link for example and details https://docs.microsoft.com/en-us/sql/t-sql/language-elements/declare-cursor-transact-sql?view=sql-server-ver16

CodePudding user response:

 ;WITH   CTE( id, parent_id ,category ) 
AS (
   
  SELECT id, parent_id ,category 
  FROM csx
  WHERE id = 10 -->> desire ID which you want to check its parent_id. for example I choose it: 10

  UNION ALL

  
  SELECT t.id, t.parent_id, t.category 
  FROM CTE p
  JOIN csx t
  ON t.id = p.parent_id 
)
SELECT *
FROM CTE
WHERE parent_id IS NULL

-- result get: 2 NULL ac

  • Related