Home > Software design >  COUNT all values in a column with JOIN
COUNT all values in a column with JOIN

Time:06-13

I am joining three tables and need to return two separate counts, one showing the total number of unique users who have purchased an item, and the other showing the total number of unique users who haven't purchased an item. These are cropped for brevity, but here are the relevant tables:

user table
 ---------- ------ ------ ----- 
| username | colb | colc | etc |
 ---------- ------ ------ ----- 
| user1    | *    | *    | *   |
| user2    | *    | *    | *   |
| user3    | *    | *    | *   |
 ---------- ------ ------ ----- 
purchase table
 ------------ --------- ---------- ------ 
| purchaseID | storeID | username | cost |
 ------------ --------- ---------- ------ 
|          1 |       1 | user1    | *    |
|          2 |       1 | user2    | *    |
|          3 |       5 | user2    | *    |
|          4 |       3 | user1    | *    |
 ------------ --------- ---------- ------ 
store table
 --------- ----------- ----- 
| storeID | storeName | etc |
 --------- ----------- ----- 
|       1 | store1    | *   |
|       2 | store2    | *   |
|       3 | store3    | *   |
 --------- ----------- ----- 

I am currently using this query to get the unique users who have purchased an item from a store:

SELECT 
    store.storeID storeID, 
    store.storeName storeName, 
    COUNT(DISTINCT CASE WHEN purchase.username IS NOT NULL
                        THEN purchase.purchaseID END) AS purchases
    [Query to retrieve total unique users who have not purchased an item]
FROM store 
LEFT JOIN purchase
    ON store.storeID = purchase.storeID
LEFT JOIN user
    ON purchase.username = user.username
GROUP BY 1, 2

I have tried a few different ways, none of which have worked. The issue I've identified is when the LEFT JOIN happens it only returns the matching results for usernames, thus the COUNT won't include the other users in the user table. I have not had any luck finding a way to fix this, so I'm hoping someone on here can lend me a hand. The results I'm hoping to see should be something like this:

 --------- ----------- ----------- -------------- 
| storeID | storeName | purchases | nonPurchases |
 --------- ----------- ----------- -------------- 
|       1 | store1    |         2 |            1 |
|       2 | store2    |         0 |            3 |
|       3 | store3    |         1 |            2 |
 --------- ----------- ----------- -------------- 

CodePudding user response:

that is actually quite simple.

First you count all user and subtract te count of distinct purchasers

SELECT 
    store.storeID storeID, 
    store.storeName storeName, 
    COUNT(DISTINCT CASE WHEN purchase.username IS NOT NULL
                        THEN purchase.purchaseID END) AS purchases,
   (SELECT COUNT(*) FROM User) - COUNT(DISTINCT CASE WHEN purchase.username IS NOT NULL
                        THEN purchase.purchaseID END) AS NON_purchases
FROM store 
LEFT JOIN purchase
    ON store.storeID = purchase.storeID
LEFT JOIN user
    ON purchase.username = user.username
GROUP BY 1, 2

CodePudding user response:

I'll go with a slightly different approach.

Generate a combination of store and user using CROSS JOIN, make it as a subquery then use that to LEFT JOIN with purchase table. In SELECT, change COUNT(DISTINCT ..) to SUM(..). Something like this:

SELECT us.storeID,
       us.storeName,
       SUM(CASE WHEN p.username IS NOT NULL
                        THEN 1 ELSE 0 END) AS purchases,
       SUM(CASE WHEN p.username IS NULL
                        THEN 1 ELSE 0 END) AS nonPurchases
FROM (SELECT storeID, storeName, username FROM user u CROSS JOIN store s) us 
LEFT JOIN purchase p
    ON us.storeID = p.storeID
    AND us.username=p.username
GROUP BY 1, 2;

Demo fiddle

  • Related