Home > Mobile >  How can I use Recursive CTE to find all products that were purchased together?
How can I use Recursive CTE to find all products that were purchased together?

Time:04-10

I'm working on a problem that requires to find all possible items that were purchased together in past matching the order at hand. The store has a large selection of items, and the customer could purchase any number of items together (i.e. 1 to N products). I've to find all items that were purchased together (including items purchased alone) in past which matches the current order in terms of items and qty ordered. If exact quantities can't be found then find the closest group.

DECLARE @SampleData as TABLE (OrderID int, Item varchar(20) , PurchaseQty int)

INSERT INTO @SampleData VALUES (1       ,'Shirt'      ,2)
INSERT INTO @SampleData VALUES (1       ,'Shoes'      ,1)
INSERT INTO @SampleData VALUES (2       ,'Shirt'      ,4)
INSERT INTO @SampleData VALUES (2       ,'Pant'       ,1)
INSERT INTO @SampleData VALUES (2       ,'Shoes'      ,1)
INSERT INTO @SampleData VALUES (3      ,'T-Shirt'     ,3)
INSERT INTO @SampleData VALUES (3      ,'Short'       ,2)
INSERT INTO @SampleData VALUES (4      ,'Shirt'       ,1)
INSERT INTO @SampleData VALUES (5      ,'Shoes'       ,3)
INSERT INTO @SampleData VALUES (5      ,'Shirt'       ,1)
INSERT INTO @SampleData VALUES (5      ,'Pant'        ,1)
INSERT INTO @SampleData VALUES (5      ,'Tie'         ,1)

e.g. if the current order is for (2 shirt, 1 shoes), I'like to find all orders that have either of these or both in it. So from above data, the result should be like OrderId 1,2,5 and 4 since they have either or both of these items. Notice the sequence of orders is to match the exact and then the next closet.

I've tried following recursive CTE, but it is entering endless recursion and crashing:

;WITH temp AS
(
   SELECT sd.OrderID, sd.Item, sd.PurchaseQty
   FROM @SampleData sd 
   WHERE sd.Item in ('shirt','short')
   UNION ALL
   SELECT sd.OrderID, t.Item, sd.PurchaseQty
   FROM @SampleData sd
   INNER JOIN temp t ON t.item = sd.Item
)
SELECT * FROM temp t 
Order By t.PurchaseQty

Althogh in this case, I've selected (shirt,shoes) as a possible order, customer can order any number of items in any qty, e.g. an order of 35 distinct items. I've to find either an order having the exact match in terms of items and qty, or the closet match. e.g. if it can't find an order with 2 shirts and 1 shoe, find an order with 1 shirt and 1 shoe etc.

Is recursive CTE the correct type of query for such a problem, if yes, how can I achieve it.

CodePudding user response:

No, you don't need a recursive query. A simple self-join with EXISTS will work:

select * from @SampleData s
where exists
( select * from @SampleData q
  where q.OrderId = s.OrderID
    and q.Item in ('Shirt','Shoes') )

If the number of items in the current order is large, you may want to put the current order in a temp table ##CurrentOrder and join to it in the inner query:

select * from @SampleData s
where exists
( select * from @SampleData q
  where q.OrderId = s.OrderID
    and q.Item in (select Item from ##CurrentOrder) )

CodePudding user response:

You can use STRING_AGG with the Distinct values. the SElection of the order happens in the subselect, as STRINg_AGG doesn't support DISTINCT

CREATE tABLE  SampleData  (OrderID int, Item varchar(20) , PurchaseQty int)

INSERT INTO SampleData VALUES (1       ,'Shirt'      ,2)
INSERT INTO SampleData VALUES (1       ,'Shoes'      ,1)
INSERT INTO SampleData VALUES (2       ,'Shirt'      ,4)
INSERT INTO SampleData VALUES (2       ,'Pant'       ,1)
INSERT INTO SampleData VALUES (2       ,'Shoes'      ,1)
INSERT INTO SampleData VALUES (3      ,'T-Shirt'     ,3)
INSERT INTO SampleData VALUES (3      ,'Short'       ,2)
INSERT INTO SampleData VALUES (4      ,'Shirt'       ,1)
INSERT INTO SampleData VALUES (5      ,'Shoes'       ,1)
INSERT INTO SampleData VALUES (5      ,'Pant'        ,1)
INSERT INTO SampleData VALUES (5      ,'Tie'         ,1)
GO
SELECT 
STRING_AGG(orderID,',')
WITHIN GROUP ( ORDER BY OrderID ASC )
FROM
(SELECT DISTINCT orderID   FROM SampleData WHERE Item IN ('shirt', 'shoes')) t1
GO
| (No column name) |
| :--------------- |
| 1,2,4,5          |

db<>fiddle here

  • Related