Home > Blockchain >  Is there a better method instead of 2 nested subqueries for this task?
Is there a better method instead of 2 nested subqueries for this task?

Time:12-26

I am trying to write the SQL to generate the count of unique users who have purchased product B but have never purchased product C.

user_id product date_purchased
1 A 2015-01-10 00:00:00.000
1 B 2014-11-23 00:00:00.000
1 C 2015-05-01 00:00:00.000
2 A 2014-10-01 00:00:00.000
2 C 2014-12-23 00:00:00.000
3 B 2015-02-15 00:00:00.000
3 D 2014-09-23 00:00:00.000
3 E 2014-06-01 00:00:00.000
4 E 2014-12-14 00:00:00.000
4 F 2015-03-03 00:00:00.000

I have a solution but its so ugly I bet there is a better way of doing it. Any thoughts on how to make this better in terms of best practices?

SELECT Count(*) AS bought_b_no_c
FROM   (SELECT user_id,
               Sum(bought_b_no_c) AS boolean_b_no_c
        FROM   (SELECT user_id,
                       product,
                       CASE
                         WHEN product = 'B' THEN 1
                         WHEN product = 'C' THEN -1
                         ELSE 0
                       END AS bought_b_no_c
                FROM   table_purchases) AS T
        GROUP  BY user_id) AS J
WHERE  boolean_b_no_c = 1

I am editing since there seem to be few answers that have operations such as SUM(product = 'C') but this returns an error such as: Incorrect syntax near ')' and it highlights the closing parathesis around the SUM function next to C.

CodePudding user response:

You can get all the users who bought 'B' but not 'C' with this query:

SELECT user_id
FROM table_purchases
WHERE product IN ('B', 'C')
GROUP BY user_id
HAVING SUM(product = 'C') = 0;

Then you only need to count its results:

SELECT COUNT(*) AS bought_b_no_c
FROM (
  SELECT user_id
  FROM table_purchases
  WHERE product IN ('B', 'C')
  GROUP BY user_id
  HAVING SUM(product = 'C') = 0
) t;

CodePudding user response:

A simple aggregation will do here:

SELECT user_id
FROM table_purchases
GROUP BY user_id
HAVING SUM(product = 'B') > 0 AND SUM(product = 'C') = 0;

We could also phrase this using exists logic:

SELECT DISTINCT user_id
FROM table_purchases t1
WHERE product = 'B' AND
      NOT EXISTS (SELECT 1 FROM table_purchases t2
                  WHERE t2.user_id = t1.user_id AND
                        t2.product = 'C');

CodePudding user response:

Sorry, I am writing this on my mobile phone so lacking a detailed explanation.

SELECT COUNT(DISTINCT t1.user_id) AS bought_b_no_c
FROM table_purchases t1
LEFT JOIN table_purchases t2 ON t1.user_id = t2.user_id AND t2.product = 'C'
WHERE t2.user_id IS NULL AND t1.product = 'B';
  • Related