Let me explain my scenario.
I have a simple table with 5.5 million records. The structure is straightforward.
relations
------------
id
parent_id
name
I have a recursive function to get all the children ids of a node; that function works fine in every case. That's not the problem here.
CREATE OR REPLACE FUNCTION children_ids(id INTEGER)
RETURNS INT[]
LANGUAGE plpgsql
AS
$$
DECLARE
ids INT[];
r RECORD;
BEGIN
FOR r IN WITH RECURSIVE t
AS
(
SELECT *
FROM relations sa
WHERE sa.id = id
UNION ALL
SELECT next.*
FROM t prev
JOIN relations next ON (next.parent_id = prev.id)
)
SELECT t.id FROM t
LOOP
ids := ids || r.id;
END LOOP;
RETURN ids;
END
$$;
Then I need to query another table to get the count of the records in any of the children ids returned by the function:
The query that I want to do is:
SELECT COUNT(p.*) FROM books AS p WHERE p.id = ANY(children_ids(20));
But it fails due to a timeout. The weird thing here is that this same query in my localhost is working fine, the databases were cloned and the PostgreSQL version is the same.
I have tested another variants and those are the result:
SELECT children_ids(20);
-- WORKS: {20}
SELECT COUNT(p.*) FROM books AS p WHERE p.id = ANY(children_ids(20));
-- FAILS: TIMEOUT
SELECT COUNT(*) FROM books AS p WHERE p.id = ANY(children_ids(20));
-- FAILS: TIMEOUT
SELECT * FROM books AS p WHERE p.id = ANY(children_ids(20));
-- FAILS: TIMEOUT
SELECT COUNT(p.*) FROM books AS p WHERE p.id = ANY(ARRAY[20]);
-- WORKS: 0
SELECT COUNT(p.*) FROM books AS p WHERE p.id IN (20);
-- WORKS: 0
SELECT COUNT(*) FROM books AS p WHERE p.id IN (20);
-- WORKS: 0
Any ideas of what could be happening here?
CodePudding user response:
You probably have a cycle in your data, so that your recursion runs into an endless loop. See the documentation for how to break the cycle. With PostgreSQL v14 or better, the simplest way would be
WITH RECURSIVE t AS (
SELECT ...
UNION ALL
SELECT ...
) CYCLE id SET is_cycle USING path