I have a table with two columns: ProductID and CustomerID
Everytime a specific customer (CustomerID) purchases a specific product (denoted by a specific ProductID), the table records that.
ProductID can appear multiple times, matched with a different customerID (and customers may purchase a product multiple times).
ProductID | CustomerID |
---|---|
1111111 | 14567 |
2222222 | 17890 |
3333333 | 17890 |
4444444 | 17890 |
5555555 | 14567 |
5555555 | 17890 |
Is there a way two pull out the customers (CustomerID) who bought a certain product (ProductID) but did not purchase another product (ProductID).
Currently, I have
select a.CustomerID
from iread a, iread b
where a.CustomerID in ({{ProductID}}) and b.CustomerID not in ({{ProductID}})
and a.CustomerID=b.CustomerID
which just gives me the ProductID back... I think it may have to do with this part:
where a.CustomerID in ({{ProductID}}) and b.CustomerID not in ({{ProductID}})
Any help/solutions welcome! Thank you!!
CodePudding user response:
Using a having clause and case expression to filter your aggregated records would help you to filter across multiple rows. This would also be less expensive than a self-join.
Eg
SELECT
CustomerID
FROM
iread
WHERE
ProductID IN (<included_comma_separated_product_ids_to_be_considered_which_are_desired_and_not_desired>)
GROUP BY
CustomerID
HAVING
SUM(
CASE WHEN ProductID=<include_desired_product_id> THEN 1 ELSE 0 END
) > 0 AND
SUM(
CASE WHEN ProductID=<include_not_desired_product_id> THEN 1 ELSE 0 END
) = 0
Below is an example where we try to find customers who purchased 5555555
but not 1111111
based on the sample data you shared.
Aggregated Results for Debugging Purposes
SELECT
CustomerID,
SUM(
CASE WHEN ProductID=5555555 THEN 1 ELSE 0 END
) as desired_product,
SUM(
CASE WHEN ProductID=1111111 THEN 1 ELSE 0 END
) as undesired_product
FROM
iread
WHERE
ProductID IN (5555555,1111111)
GROUP BY
CustomerID;
CustomerID | desired_product | undesired_product |
---|---|---|
14567 | 1 | 1 |
17890 | 1 | 0 |
Using Having and Case expressions to retrieve desired customers
SELECT
CustomerID
FROM
iread
WHERE
ProductID IN (5555555,1111111)
GROUP BY
CustomerID
HAVING
SUM(
CASE WHEN ProductID=5555555 THEN 1 ELSE 0 END
) > 0 AND
SUM(
CASE WHEN ProductID=1111111 THEN 1 ELSE 0 END
) = 0;
CustomerID |
---|
17890 |
View working demo on DB Fiddle
Edit 1: I have modified the examples above to include a where clause. This should take advantage of indexes on the columns used in the where clause if indexes are available on the table.
Let me know if this works for you.