Home > Blockchain >  Converting recursive CTE from Postgre SQL to SQL Server
Converting recursive CTE from Postgre SQL to SQL Server

Time:06-22

I am having trouble adapting some recursive CTE code from PostgreSQL to SQL Server.

Here is my table:

CREATE TABLE flight (
  src CHAR(3)
, dest CHAR(3)
, stt DATETIME
, endt DATETIME);

INSERT INTO flight VALUES 
('MSP', 'SLC', '2022-10-02 11:45:00', '2022-10-02 14:10:00'),
('SLC', 'LAX', '2022-10-02 15:20:00', '2022-10-02 17:45:00'),
('MSP', 'LAX', '2022-10-02 12:15:00', '2022-10-02 15:05:00')

and what I am trying to adapt:

WITH RECURSIVE flight_paths (src, flights, path, dest, stt, endt) AS (
SELECT
  src
, ARRAY[src || '-' || dest]
, ARRAY[src]
, dest 
, stt
, endt
FROM flight
UNION ALL
SELECT
  fp.src
, fp.flights || (f.src || '-' || f.dest)
, fp.path || f.src
, f.dest
, fp.stt
, f.endt
FROM flight f
JOIN flight_paths fp ON f.src = fp.dest 
WHERE NOT f.src = ANY(fp.path) 
  AND NOT 'LAX' = ANY(fp.path) 
  AND f.stt > fp.endt
) 
SELECT flights, stt, endt, path[2:] stopovers
FROM flight_paths
WHERE src = 'MSP' AND dest = 'LAX'

I have been having issues adapting the use of ARRAYs. Any pointers would be really helpful!

CodePudding user response:

Unfortunately, SQL Server does not support arrays. But you could use a JSON array instead.

A pretty straight-up copy of your original code results in this

WITH flight_paths (src, flights, path, dest, stt, endt) AS (

SELECT
  src
, CAST('["'   src   '-'   dest  '"]' AS nvarchar(max))
, CAST('["'   src   '"]' AS nvarchar(max))
, dest 
, stt
, endt
FROM flight

UNION ALL

SELECT
  fp.src
, JSON_MODIFY(fp.flights, 'append $', f.src   '-'   f.dest)
, JSON_MODIFY(fp.path, 'append $', f.src)
, f.dest
, fp.stt
, f.endt
FROM flight f
JOIN flight_paths fp ON f.src = fp.dest AND f.dest <> fp.src
WHERE EXISTS (SELECT 1
    FROM OPENJSON(fp.path) arr
    WHERE arr.value NOT IN(f.src, 'LAX')
  )
  AND f.stt > fp.endt

) 
SELECT flights, stt, endt, path stopovers
FROM flight_paths
WHERE src = 'MSP'
  AND dest = 'LAX'

Some minor changes in the logic were necessary:

  • It's difficult to delete an element from a JSON array. So instead the path only builds the intermediate nodes, and a separate check is made to exclude the start point.

I note that it's probably faster to check explicitly that the destination has not been reached, and also to push the start point into the anchor part of the CTE:

WITH flight_paths (src, flights, path, dest, stt, endt) AS (

SELECT
  src
, CAST('["'   src   '-'   dest  '"]' AS nvarchar(max))
, CAST('["'   src   '"]' AS nvarchar(max))
, dest 
, stt
, endt
FROM flight
WHERE src = 'MSP'

UNION ALL

SELECT
  fp.src
, JSON_MODIFY(fp.flights, 'append $', f.src   '-'   f.dest)
, JSON_MODIFY(fp.path, 'append $', f.src)
, f.dest
, fp.stt
, f.endt
FROM flight f
JOIN flight_paths fp ON f.src = fp.dest AND f.dest <> fp.src
  AND fp.dest <> 'LAX'
WHERE EXISTS (SELECT 1
    FROM OPENJSON(fp.path) arr
    WHERE arr.value <> f.src
  )
  AND f.stt > fp.endt

) 
SELECT flights, stt, endt, path stopovers
FROM flight_paths
WHERE dest = 'LAX';

db<>fiddle

CodePudding user response:

Much more quicker will be :

WITH flight_paths (src, flights, path, dest, stt, endt) AS (SELECT
  src
, CAST('{"'   src   '-'   dest  '"}' AS nvarchar(max))
, CAST('{"'   src   '"}' AS nvarchar(max))
, dest 
, stt
, endt
FROM flight
WHERE src = 'MSP'

UNION ALL

SELECT
  fp.src
, fp.flights   '{"'  f.src '"}'   '{"'  f.dest '"}'
, fp.path   '{"'  f.src '"}'
, f.dest
, fp.stt
, f.endt
FROM flight f
JOIN flight_paths fp ON f.src = fp.dest AND f.dest <> fp.src
  AND fp.dest <> 'LAX'
WHERE fp.path NOT LIKE '%'   '{"'  f.src '"}'   '%'
  AND f.stt > fp.endt

) 
SELECT flights, stt, endt, path stopovers
FROM flight_paths
WHERE dest = 'LAX';

Tested with 100003 rows, the results are :

  • Charlieface Query 1 : 5' 05"
  • Charlieface Query 2 : 4' 58"
  • My own one : 0' 01"

Running on a Windows 2019 Developper / PC with 256 Go RAM and 24 cores x 2 intel Xeon CPU E5-1650 v4

Conclusion : JSON or XML are very costly in recursive queries...

  • Related