I would like to use a division query to have all the customers who bought all products.
The template is the following :
SELECT *
FROM Customers AS A
WHERE NOT EXISTS
(
SELECT *
FROM Products AS B
WHERE NOT EXISTS
(
SELECT *
FROM Purchases AS C
WHERE C.CustomerID= A.CustomerID
AND C.ProductID= B.ProductID
));
It's well explained in this topic: https://stackoverflow.com/a/71877187/17783040
My issue is I don't have 3 tables to join but 4. The division query is supposed to work the same but I can't succeed it.
So my 4 tables:
- Customers: PK is CustomerID
- Products: PK is ProductID
- Invoices: PK is InvoiceID and FK is CustomerID
- InvoicesLines: FK are InvoiceID and ProductID
I would like to practice this kind of query even if it's an hard way. I really understood the division query working for 3 tables in a many to many relationship.
But in my case, I need a 4th table to join CustomerID to ProductID because the intermediate table as InvoiceLine collects each line of the invoice for a particular ProductID and quatity purchased ( InvoiceID). Then the Invoice table contains only the InvoiceID and CustomerID.
For example with data:
Customers Table:
CustomerID name surname address
1 Charles Smith 123 main street
2 Henry Johnson 546 Drive
3 Jennifer Davis 65 Avenue
Products Table:
ProductID name quantity
1 pc 12
2 usb 56
3 tv 67
Invoices Table:
InvoiceID CustomerID
1 1
2 3
3 2
4 3
5 1
InvoiceLines Table:
InvoicelineID ProductID description InvoiceID
1 2 blablabla 1
2 1 blablabla 2
3 2 blablabla 2
4 1 blablabla 2
5 2 blablabla 3
6 1 blablabla 3
7 3 blablabla 4
8 3 blablabla 5
So it's the CustomerID #3 who have bought all the products (1,2 and 3) that we can see in the InvoiceLines table for the IncoiceID #2 and #4
This following query returns all the customers in the database without any restriction and not only the one who purchased all the products in catalogue:
SELECT *
FROM Customers AS C
WHERE NOT EXISTS
(
SELECT *
FROM Products AS P
WHERE NOT EXISTS
(
SELECT *
FROM Invoices AS I
WHERE NOT EXISTS
(
SELECT *
FROM InvoiceLines AS L
WHERE I.CustomerID= C.CustomerID
AND L.InvoiceID= I.InvoiceID
AND L.ProductID= P.ProductID
)));
It's other try is not working either:
SELECT *
FROM Customers AS C
WHERE NOT EXISTS
(
SELECT *
FROM Products AS P
WHERE NOT EXISTS
(
SELECT *
FROM
(
SELECT * FROM InvoiceLines AS L, Invoices AS I
WHERE L.InvoiceID= I.InvoiceID
) AS Pu
WHERE Pu.CustomerID= C.CustomerID
AND Pu.ProductID= P.ProductID
)
);
Do you have any hint or advice to make it work using the division query template?
CodePudding user response:
The logic in your last query is correct, only that you need to replace select *
with select l.ProductID, i.CustomerID
(otherwise you will get "duplicate column name" errors):
SELECT *
FROM Customers AS C
WHERE NOT EXISTS
(
SELECT *
FROM Products AS P
WHERE NOT EXISTS
(
SELECT *
FROM
(
SELECT L.ProductID, I.CustomerID
FROM InvoiceLines AS L
JOIN Invoices AS I ON L.InvoiceID = I.InvoiceID
) AS Pu
WHERE Pu.CustomerID = C.CustomerID
AND Pu.ProductID = P.ProductID
)
)
CodePudding user response:
The 2 main columns of interest for solving the question are CustomerID and ProductID which are in Invoices and InvoiceLines, the other tables are of interest only to return additional columns once you get the result, which is the set of I.CustomerId for which the count(distinct L.ProductID) is equal to the count(P.ProductID).