Home > Enterprise >  Remove rows from recursive query
Remove rows from recursive query

Time:12-05

When I run this query in my database I get the following:

WITH RECURSIVE series AS (
    SELECT CONCAT( a.title) as str, a.prequelID
    FROM ( Prequels NATURAL JOIN Books) AS a
    UNION
    SELECT CONCAT(t.title, ' -> ', str) as str, t.prequelID
FROM (Books NATURAL JOIN Prequels) as t 
INNER JOIN series AS s ON s.prequelID = t.bookID
)
SELECT str as series FROM series
ORDER BY series;

This is the result :

enter image description here

I don't want any duplicates, only the full string that shows the whole series. How do I do this?

UPDATE:

I updated the query because I realised I was missing the first book in the series.

WITH RECURSIVE series AS (
    SELECT CONCAT( a.title) as str, a.prequelID
    FROM ( Prequels NATURAL JOIN Books) AS a
    UNION
    SELECT CONCAT(t.title, ' -> ', str) as str, t.prequelID
FROM (Books NATURAL JOIN Prequels) as t 
INNER JOIN series AS s ON s.prequelID = t.bookID
)
SELECT CONCAT(a.title, ' -> ',a.str) as series
FROM (
SELECT Books.title, series.str
FROM series JOIN Books ON series.prequelID = Books.bookID
WHERE NOT EXISTS(
        SELECT prequelID
        FROM Prequels
        WHERE series.prequelID = Prequels.bookID
) 
) a
ORDER BY series;

The result is still a bit off, since I only want the complete string of the series: enter image description here

How do I fix this?

Tables:

CREATE TABLE Books
(bookID integer PRIMARY KEY,
title varchar(100),
pages integer);


CREATE TABLE Prequels
(bookID INTEGER REFERENCES Books(bookID),
prequelID INTEGER REFERENCES Books(bookID),
PRIMARY KEY (bookID,prequelID));

Sample data of the Game of Thornes series:

INSERT INTO BOOKS (bookID,title,pages) VALUES (80429,'A Game of Thrones',292);
INSERT INTO BOOKS (bookID,title,pages) VALUES (41121,'A Clash of Kings',160);
INSERT INTO BOOKS (bookID,title,pages) VALUES (29287,'A Storm of Swords',160);
INSERT INTO BOOKS (bookID,title,pages) VALUES (17696,'A Feast for Crows',292);
INSERT INTO BOOKS (bookID,title,pages) VALUES (3947,'A Dance with Dragons',101);


INSERT INTO Prequels (bookID,prequelID) VALUES (41121,80429);
INSERT INTO Prequels (bookID,prequelID) VALUES (29287,41121);
INSERT INTO Prequels (bookID,prequelID) VALUES (17696,29287);
INSERT INTO Prequels (bookID,prequelID) VALUES (3947,17696);

CodePudding user response:

You'd want to start from the roots of the trees.
There are no roots in the prequelId's.

WITH RECURSIVE series AS (
    SELECT 
      b.bookId
    , 1 as lvl
    , p.prequelID
    , CONCAT(b.title) as series
    FROM Books AS b
    LEFT JOIN Prequels AS p 
      ON p.bookId = b.bookId
    WHERE NOT EXISTS (
      SELECT 1
      FROM Prequels p2
      WHERE p2.prequelID = b.bookId
    )
    
    UNION ALL
    
    SELECT 
      s.bookId
    , s.lvl 1
    , p.prequelID
    , CONCAT(b.title, ' -> ', s.series)
    FROM series AS s
    JOIN Books AS b 
      ON b.bookId = s.prequelID
    LEFT JOIN Prequels AS p 
      ON p.bookId = s.prequelID
)
SELECT series
FROM series
WHERE prequelId is null
  AND lvl > 1
ORDER BY series;
series
A Game of Thrones -> A Clash of Kings -> A Storm of Swords -> A Feast for Crows -> A Dance with Dragons

Demo on db<>fiddle here

CodePudding user response:

Modified version of LukStorms now deleted answer...

https://dbfiddle.uk/?rdbms=postgres_14&fiddle=4e26beed430aefb60f1dc91766db13d3

WITH RECURSIVE series AS (
   SELECT
      b.bookId,
      1 as lvl,
      p.prequelID,
      CONCAT(b.title)   AS series
    FROM
      books      b
    LEFT JOIN
      prequels   p
        ON p.bookid = b.bookid
    WHERE NOT EXISTS (
      SELECT 1
      FROM Prequels p
      WHERE p.prequelID = b.bookId
    )

    UNION ALL

    SELECT
      b.bookid,
      lvl 1,
      p.prequelID,
      CONCAT(b.title, ' -> ', s.series)
    FROM
      series     s
    INNER JOIN
      books      b
        ON b.bookid = s.prequelid
    LEFT JOIN
      prequels   p
        ON p.bookid = b.bookid
)
SELECT
  s.series
FROM
  series s
WHERE
  s.prequelid IS NULL
ORDER BY
  s.series;

The non recuraive part of the CTE has a LEFT JOIN so to include books that are not part of a series, just a standalone book. Revert that to NATURAL or INNER JOIN if you don't want that.

The non recursive part of the CTE has a WHERE clause to ensure it only ever starts from the last book in a series.

Thr recursive part of the CTE has a LEFT JOIN on the prequels table, so that the first book in a series isn't lost (due to not having a prequel row), and so records that book's prequel as NULL.

The outer query then looks for that NULL, so only outputting rows for completed series.

CodePudding user response:

After the recursive query, you can compare all the str together with a self join, and keep the s1.str which are not a substring of any s2.str thanks to the WHERE s2.str IS NULL clause.

WITH RECURSIVE series AS (
    SELECT CONCAT( a.title) as str, a.prequelID
    FROM ( Prequels NATURAL JOIN Books) AS a
    UNION
    SELECT CONCAT(t.title, ' -> ', str) as str, t.prequelID
FROM (Books NATURAL JOIN Prequels) as t 
INNER JOIN series AS s ON s.prequelID = t.bookID
)
SELECT s1.str as series 
  FROM series AS s1
  LEFT JOIN series AS s2
    ON s2.str ~ s1.str
   AND s2.str <> s1.str
 WHERE s2.str IS NULL
ORDER BY series;

test result in dbfiddle

  • Related