Home > Software design >  Weird problem using ANY operator in SQL query with PostgreSQL
Weird problem using ANY operator in SQL query with PostgreSQL

Time:10-25

Let me explain my scenario.

I have a simple table with 5.5 million records. The structure is straightforward.

relations
------------
id
parent_id
name

I have a recursive function to get all the children ids of a node; that function works fine in every case. That's not the problem here.

CREATE OR REPLACE FUNCTION children_ids(id INTEGER)
RETURNS INT[]
LANGUAGE plpgsql
AS
$$
DECLARE
    ids INT[];
    r RECORD;
BEGIN
    FOR r IN WITH RECURSIVE t
    AS
    (
        SELECT * 
          FROM relations sa
         WHERE sa.id = id
         UNION ALL
        SELECT next.*
          FROM t prev
          JOIN relations next ON (next.parent_id = prev.id)
    )
    SELECT t.id FROM t
    LOOP
        ids := ids || r.id;
    END LOOP;

    RETURN ids;
END
$$;

Then I need to query another table to get the count of the records in any of the children ids returned by the function:

The query that I want to do is:

SELECT COUNT(p.*) FROM books AS p WHERE p.id = ANY(children_ids(20)); 

But it fails due to a timeout. The weird thing here is that this same query in my localhost is working fine, the databases were cloned and the PostgreSQL version is the same.

I have tested another variants and those are the result:

SELECT children_ids(20);
-- WORKS: {20}

SELECT COUNT(p.*) FROM books AS p WHERE p.id = ANY(children_ids(20)); 
-- FAILS: TIMEOUT

SELECT COUNT(*) FROM books AS p WHERE p.id = ANY(children_ids(20)); 
-- FAILS: TIMEOUT

SELECT * FROM books AS p WHERE p.id = ANY(children_ids(20));
-- FAILS: TIMEOUT

SELECT COUNT(p.*) FROM books AS p WHERE p.id = ANY(ARRAY[20]);
-- WORKS: 0

SELECT COUNT(p.*) FROM books AS p WHERE p.id IN (20);
-- WORKS: 0

SELECT COUNT(*) FROM books AS p WHERE p.id IN (20);
-- WORKS: 0

Any ideas of what could be happening here?

CodePudding user response:

You probably have a cycle in your data, so that your recursion runs into an endless loop. See the documentation for how to break the cycle. With PostgreSQL v14 or better, the simplest way would be

WITH RECURSIVE t AS (
   SELECT ...
   UNION ALL
   SELECT ...
) CYCLE id SET is_cycle USING path
  • Related