Get N most frequently bought combination product in one transaction using MSSQL-CodePudding

I have a data set like below:

transaction_id	store_id	product_id	product_name
1	100	p001	product_1
1	100	p002	product_2
1	100	p003	product_3
4	100	p002	product_2
4	100	p003	product_3
5	100	p002	product_2
5	100	p003	product_3
7	100	p001	product_1
7	100	p003	product_3
8	101	p002	product_2
8	101	p003	product_3
9	101	p001	product_1
9	101	p002	product_2
2	101	p001	product_1
3	101	p002	product_2
3	101	p003	product_3
6	101	p001	product_1
6	101	p002	product_2

I am trying to find a query to give output as below.

store	freq_prod_ids	count_of_transactions
100	p002, p003	3
100	p001, p003	2
101	p001, p002	2
101	p002, p003	2

This essentially should give the top 2 [N=2] frequently bought product combinations in a single transaction for each store.

Please help to have an SQL query to get this response.

CodePudding user response：

You may try the following which performs a self-join before aggregating based on store and product pairs. The row_number is used to retrieve the top 2 product pairs per store.

SELECT
    store_id, freq_prod_ids,count_of_transactions
FROM (
    SELECT
        t1.store_id,
        CONCAT(t1.product_id,', ',t2.product_id) as freq_prod_ids,
        COUNT(1) as count_of_transactions,
        ROW_NUMBER() OVER (PARTITION BY t1.store_id ORDER BY COUNT(1) DESC) as rn
    FROM my_table t1 
    INNER JOIN my_table t2 on t1.store_id = t2.store_id and 
                              t1.product_id < t2.product_id and
                              t1.transaction_id = t2.transaction_id
    GROUP BY t1.store_id,CONCAT(t1.product_id,', ',t2.product_id)
) t3 WHERE rn <=2

View working demo db fiddle

Let me know if this works for you.