Home > Blockchain >  Get N most frequently bought combination product in one transaction using MSSQL
Get N most frequently bought combination product in one transaction using MSSQL

Time:10-21

I have a data set like below:

transaction_id store_id product_id product_name
1 100 p001 product_1
1 100 p002 product_2
1 100 p003 product_3
4 100 p002 product_2
4 100 p003 product_3
5 100 p002 product_2
5 100 p003 product_3
7 100 p001 product_1
7 100 p003 product_3
8 101 p002 product_2
8 101 p003 product_3
9 101 p001 product_1
9 101 p002 product_2
2 101 p001 product_1
3 101 p002 product_2
3 101 p003 product_3
6 101 p001 product_1
6 101 p002 product_2

I am trying to find a query to give output as below.

store freq_prod_ids count_of_transactions
100 p002, p003 3
100 p001, p003 2
101 p001, p002 2
101 p002, p003 2

This essentially should give the top 2 [N=2] frequently bought product combinations in a single transaction for each store.

Please help to have an SQL query to get this response.

CodePudding user response:

You may try the following which performs a self-join before aggregating based on store and product pairs. The row_number is used to retrieve the top 2 product pairs per store.

SELECT
    store_id, freq_prod_ids,count_of_transactions
FROM (
    SELECT
        t1.store_id,
        CONCAT(t1.product_id,', ',t2.product_id) as freq_prod_ids,
        COUNT(1) as count_of_transactions,
        ROW_NUMBER() OVER (PARTITION BY t1.store_id ORDER BY COUNT(1) DESC) as rn
    FROM my_table t1 
    INNER JOIN my_table t2 on t1.store_id = t2.store_id and 
                              t1.product_id < t2.product_id and
                              t1.transaction_id = t2.transaction_id
    GROUP BY t1.store_id,CONCAT(t1.product_id,', ',t2.product_id)
) t3 WHERE rn <=2

View working demo db fiddle

Let me know if this works for you.

  • Related