Home > Mobile >  Self-join to find Customers IDs not in Item (Purchase)
Self-join to find Customers IDs not in Item (Purchase)

Time:10-20

I have a table with two columns: ProductID and CustomerID

Everytime a specific customer (CustomerID) purchases a specific product (denoted by a specific ProductID), the table records that.

ProductID can appear multiple times, matched with a different customerID (and customers may purchase a product multiple times).

ProductID CustomerID
1111111 14567
2222222 17890
3333333 17890
4444444 17890
5555555 14567
5555555 17890

Is there a way two pull out the customers (CustomerID) who bought a certain product (ProductID) but did not purchase another product (ProductID).

Currently, I have

select a.CustomerID
from iread a, iread b
where a.CustomerID in ({{ProductID}}) and b.CustomerID not in ({{ProductID}})
and a.CustomerID=b.CustomerID

which just gives me the ProductID back... I think it may have to do with this part:

where a.CustomerID in ({{ProductID}}) and b.CustomerID not in ({{ProductID}})

Any help/solutions welcome! Thank you!!

CodePudding user response:

Using a having clause and case expression to filter your aggregated records would help you to filter across multiple rows. This would also be less expensive than a self-join.

Eg

SELECT
    CustomerID
FROM
    iread
WHERE 
    ProductID IN (<included_comma_separated_product_ids_to_be_considered_which_are_desired_and_not_desired>)
GROUP BY
    CustomerID
HAVING
    SUM(
        CASE WHEN ProductID=<include_desired_product_id> THEN 1 ELSE 0 END
    ) > 0 AND
    SUM(
        CASE WHEN ProductID=<include_not_desired_product_id> THEN 1 ELSE 0 END
    ) = 0

Below is an example where we try to find customers who purchased 5555555 but not 1111111 based on the sample data you shared.

Aggregated Results for Debugging Purposes

SELECT
    CustomerID,
    SUM(
        CASE WHEN ProductID=5555555 THEN 1 ELSE 0 END
    ) as desired_product,
    SUM(
        CASE WHEN ProductID=1111111 THEN 1 ELSE 0 END
    ) as undesired_product
FROM
    iread
WHERE 
    ProductID IN (5555555,1111111)
GROUP BY
    CustomerID;
CustomerID desired_product undesired_product
14567 1 1
17890 1 0

Using Having and Case expressions to retrieve desired customers

SELECT
    CustomerID
FROM
    iread
WHERE 
    ProductID IN (5555555,1111111)
GROUP BY
    CustomerID
HAVING
    SUM(
        CASE WHEN ProductID=5555555 THEN 1 ELSE 0 END
    ) > 0 AND
    SUM(
        CASE WHEN ProductID=1111111 THEN 1 ELSE 0 END
    ) = 0;
CustomerID
17890

View working demo on DB Fiddle

Edit 1: I have modified the examples above to include a where clause. This should take advantage of indexes on the columns used in the where clause if indexes are available on the table.

Let me know if this works for you.

  • Related