Suppose we have an 'order' table consisting of two fields - customer and product purchased by the customer.
CREATE TABLE orders (customer VARCHAR(16) NOT NULL, product_id INT NOT NULL);
If similar customers are defined as those who purchased at least two same products. I'm wondering how I can get products purchased by similar customers of each customer by a query. (Note that the products purchased by the customer itself should be excluded).
The input:
INSERT INTO orders VALUES
('A', 1), ('A', 2), ('A', 3),
('B', 1), ('B', 2), ('B', 4),
('C', 1), ('C', 3), ('C', 4), ('C', 5);
Ideal output:
|customer|product_id|
|---|---|
|'A'| 4 |
|'A'| 5 |
|'B'| 3 |
|'B'| 5 |
|'C'| 2 |
For example, customer A and customer B both bought product 1 and product 2, so they are similar customers. Therefore, product 4 is a product bought by a similar customer of customer A, so is it listed. I could get similar customers by the following code but don't know how to proceed:
WITH common AS (
SELECT o1.customer AS cust_1, o2.customer AS cust_2, o1.product_id AS prod_id,
COUNT(*) OVER (PARTITION BY o1.customer, o2.customer) AS same_purchased
FROM orders o1 JOIN orders o2
ON (o1.customer < o2.customer AND o1.product_id = o2.product_id))
SELECT cust_1, cust_2, prod_id
FROM common WHERE same_purchased >= 2
SQL fiddle: https://dbfiddle.uk/S6X7gCJH
CodePudding user response:
Here the SQL self-explanatory:
WITH similars as (
SELECT
c1.customer,
c2.customer as similar
from
orders c1
inner join orders c2 on c1.customer != c2.customer
and c1.product_id = c2.product_id
group by
c1.customer,
c2.customer
having
count(*) > 1
)
select
distinct s.customer,
o.product_id
from
similars s
inner join orders o on o.customer = s.similar
where
o.product_id not in (
select
product_id
from
orders
where
customer = s.customer
)
order by
s.customer,
o.product_id
Result
mer | product_id |
---|---|
A | 4 |
A | 5 |
B | 3 |
B | 5 |
C | 2 |