Home > Back-end >  SQL Recursive CTE: stop recursion on all branches when one branch fulfills some condition?
SQL Recursive CTE: stop recursion on all branches when one branch fulfills some condition?

Time:11-05

I have a table of parent-child relationships, and for a list of nodes (let's call them "roots"), I want to see which ones have one (or more) parents with some characteristic (let's call it "bad").

I can write a recursive CTE to find all parent nodes, and then compare them against a list of bad nodes. However, this is slow, as a node can have very many parents, and I am not really interested in finding all of them. I only need to know if some parent is bad. I would like to stop the recursion for a particular root when a bad parent is found.

example of a structure

Here is a mock-up of what the query could look like:

with recursive_table (root, level, path, current_node) as (
    select root = node, level = 0, path = 'root', current_node = node
    from seed_nodes
        UNION ALL
    select o.root, level   1, CONCAT(e.parent, '/', o.path), e.parent
    from structure e
    inner join recursive_table o
    on o.current_node = e.child

    -- it is possible to stop searching this particular branch if we pass a bad node
    and o.node not in ( 
        select b.node
        from bad_node_list b
        )
    
    -- it is NOT possible to stop investigating a particular root if any bad node is found
    and o.root not in (
        select oo.root
        from recursive_table oo 
        join bad_node_list b
        on oo.current_node = b.node
        where o.root = oo.root
        )
    
    ) 
select *
from seed_nodes a
outer apply (
    select top 1 current_node
    from recursive_table b
    join bad_node_list c
    on b.current_node = c.node
    where a.node = b.root
    ) b

As the code in the comments say, it is possible to stop searching in a branch when a bad node is found. But it is not possible to stop searching the root altogether when the first bad node is found. SQL Server will give the error "Recursive member of a common table expression 'recursive_table' has multiple recursive references."

Is there any way around this? (I know that SQL server tries to disallow some other functions, like aggregate functions, but that it's possible to get around that and still use them.)

CodePudding user response:

Is there any way to access the other branches in a recursive function? No.

Recursion in SQL Server is implemented so that this is not possible. Additionally, recursion actually happens depth-first, and not breadth-first (as the syntax of SQL recursion might have you believe).

It is possible to rewrite this as an actual breadth-first search, by using a loop, and include the condition to stop searches when the first bad node is found.

For my particular set of data, this did not improve performance, and was instead around 50% more time-consuming (15 minutes instead of 10 minutes).

  • Related