Home > Blockchain >  Implement SQL query with conditional sum and how does it work?
Implement SQL query with conditional sum and how does it work?

Time:12-28

I am trying to write the SQL to generate the count of unique users who have purchased product B but have never purchased product C.

user_id product date_purchased
1 A 2015-01-10 00:00:00.000
1 B 2014-11-23 00:00:00.000
1 C 2015-05-01 00:00:00.000
2 A 2014-10-01 00:00:00.000
2 C 2014-12-23 00:00:00.000
3 B 2015-02-15 00:00:00.000
3 D 2014-09-23 00:00:00.000
3 E 2014-06-01 00:00:00.000
4 E 2014-12-14 00:00:00.000
4 F 2015-03-03 00:00:00.000

Someone suggested trying the following

SELECT COUNT(*) AS bought_b_no_c
FROM (
  SELECT user_id
  FROM table_purchases
  WHERE product IN ('B', 'C')
  GROUP BY user_id
  HAVING SUM(product = 'C') = 0
) t;

However, SQL Server is saying I have an error as follows:

Msg 102, Level 15, State 1, Line 5 Incorrect syntax near ')'.

Anybody know why this is happening and how to learn exactly how this work?

PS This was my original code

SELECT Count(*) AS bought_b_no_c
FROM   (SELECT user_id,
               Sum(bought_b_no_c) AS boolean_b_no_c
        FROM   (SELECT user_id,
                       product,
                       CASE
                         WHEN product = 'B' THEN 1
                         WHEN product = 'C' THEN -1
                         ELSE 0
                       END AS bought_b_no_c
                FROM   table_purchases) AS T
        GROUP  BY user_id) AS J
WHERE  boolean_b_no_c = 1

CodePudding user response:

You could use this:

SELECT COUNT(b.*) AS bought_b_no_c
FROM table_purchases b
WHERE b.product = 'B'
AND b.user_id NOT IN (
  SELECT c.user_id
  FROM table_purchases c
  WHERE c.product = 'C'
)

This one "reads" more natural to me.


The other solution someone suggested can work with the fix from Dale K:

SELECT COUNT(*) AS bought_b_no_c
FROM (
  SELECT user_id
  FROM table_purchases
  WHERE product IN ('B', 'C')
  GROUP BY user_id
  HAVING SUM(case when product = 'C' then 1 else 0 end) = 0
) t;

I'm not sure which one has better performance, but both should do the trick.

CodePudding user response:

Here is a way using NOT EXISTS. I translated the question "get unique users who use product B and has not purchased product C"

select count(distinct x.user) /*gets unique users*/
  from table x
 where x.product='B' /*who purchased product B*/
   and not exists(select null
                    from table y
                   where y.user=x.user
                     and x.product = 'C' /*and the user has not purchased C */)
  • Related