Home > database >  how expensive are the postgres WITH RECURSIVE?
how expensive are the postgres WITH RECURSIVE?

Time:12-15

I have a product system that each category can have multiple subcategory layers if there aren't any products attached to them, and we can attach as many products as we want to a category if there aren't any sub-categories attached to it like the example below:

PS: I deleted other columns/constraints from the two tables for simplicity

      CREATE TABLE categories (
        id serial PRIMARY KEY,
        name VARCHAR NOT NULL,
        parent_id INT
    );
    CREATE TABLE products (
        id serial PRIMARY KEY,
        name VARCHAR NOT NULL,
        category_id serial,
        CONSTRAINT fk_category FOREIGN KEY(category_id) REFERENCES categories(id)
    );
INSERT INTO categories (
    id,
    name,
    parent_id
)
VALUES
    (1, 'Electronic Devices', NULL),
    (2, 'lap-top', 1),
    (3, 'smart-phones', 1),
    (4, 'headphones', 1),
    (5, 'desctops', 1),
    (6, '16inc laptops', 2),
    (7, '13inc laptops', 2),
    (8, 'smart phones', 3),
    (9, 'phablets', 3),
    (10, 'wireless headphones', 4),
    (11, 'wired headphones', 4),
    (12, 'onear headphones', 10),
    (13, 'large headphones', 10);


INSERT into products (
    id, name, category_id
) VALUES 
    (1, 'mac-book-16 model 1', 6),
    (2, 'mac-book-16 model 2', 6),
    (3, 'mac-book-16 model 3', 6),
    (4, 'mac-book-13 model 1', 7),
    (5, 'mac-book-13 model 2', 7),
    (6, 'mac-book-13 model 3', 7),
    (7, 'mac-book-13 model 4', 7),
    (8, 'iphone 10', 8),
    (9, 'iphone 11', 8),
    (10, 'iphone 12', 8),
    (11, 'iphone 13', 8),
    (12, 'iphone 14', 8),
    (13, 'iphone 10 pro max', 9),
    (14, 'iphone 11 pro max', 9),
    (15, 'iphone 12 pro max', 9),
    (16, 'iphone 13 pro max', 9),
    (17, 'iphone 14 pro max', 9),
    (18, 'galegxy note 8', 9),
    (19, 'galegxy note 9', 9),
    (20, 'galegxy note 10', 9),
    (21, 'some headphones', 12),
    (22, 'samsung galexypods', 12),
    (23, 'apple earpods', 12),
    (24, 'apple earpod pro', 13);

Let's say I want to get all the products related to "Electronic Devices" how I'm planning to do it like below:

WITH RECURSIVE products_in_category AS (
    SELECT
        p.*,
        c.parent_id
    FROM
        products p
    INNER join categories c on c.id = p.category_id
    WHERE 
        category_id = 1
    UNION
        SELECT
            p2.*,
            c2.parent_id
        FROM
            products p2
        INNER join categories c2 on c2.id = p2.category_id
        INNER JOIN products_in_category s ON s.category_id = c2.parent_id
) SELECT
    *
FROM
    products_in_category limit 25;

I'm expecting to see all 24 rows in products, but I'm getting 0.

A: Is it possible with WITH RECURSIVE? , if it is how?

B: Is it a good way to do it this way(I do create indexes on my id fields)? is it scaleable (when we have 500 categories & 10k products for example)? if not what are the alternatives?

CodePudding user response:

To answer the indirect question of "why doesn't this work"...


First, you should only recurse that part of the structure that needs it; the categories in your case

  • build the tree of categories first, inside the recursive CTE
  • join the products to it outside of the CTE

Second, your expected results don't need a recursive query, just a join.

SELECT
  p.*,
  c.parent_id
FROM
  products   p
INNER join
  categories c
    ON c.id = p.category_id

To get all products within an arbitrary category (and its children categories)...

WITH RECURSIVE
  recursed_category AS
(
  SELECT
    id     AS root_category_id,
    name   AS root_category_name,
    id     AS category_id,
    name   AS category_name
  FROM
    categories

  UNION ALL

  SELECT
    p.root_category_id,
    p.root_category_name,
    c.id,
    c.name
  FROM
    recursed_category   AS p  -- parent
  INNER JOIN
    categories          AS c  -- child
      ON c.parent_id = p.category_id
)
SELECT
  c.*,
  p.id    AS product_id,
  p.name  AS product_name
FROM
  recursed_category   AS c  -- category
INNER join
  products            AS p  -- product
    ON c.id = p.category_id
WHERE
  c.root_category_id = ???

This builds multiple trees, one for each category, and travels to every child node of each category.

Then in the WHERE clause you can specify which one you're actually interested in.

As SQL is Declarative, the optimiser willl apply the outer WHERE clause to the CTE, and not build all the trees then throw some of them away.

Then, it's the simple query again; just take the category tree and join the products on, in the outer query.

NOTE: This expressly starts at a category and browses down. If you wanted to know which categories a product is in, do NOT use this. Make your recursive CTE similarly, but start at the children and browse up. Otherwise, the optimiser won't be able to push the where clause up to the CTE.

CodePudding user response:

In this example as you can see on my Git Repos also : https://github.com/WalterMatsuda/WITH-RECURSIVE-x-SELECT

The difference of the same operation made with a LOOP and subqueries is : In a 1000000 Rows tables with 5 levels of hierarchy

For Loops and subqueries 
Query OK (execution time: 15.359 sec; total time: 15.422 sec)
For With Recursive 
Query OK (execution time: 62 ms; total time: 125 ms) 

Both returning 1667 values . You can try a more efficient looping logic than the most easier, but WITH RECURSIVE is made to use recursion. Loops in other hand when you dont need to access multiple level,only want to repeat something withouth going to another conditional repetition is more efficient than Recursion. So both should be used in their proper roles.

  • Related