Filter query to make sure that grouped by column has that value at least one time?

Time:01-05

What is the proper way to filter a grouped query so that a group is kept when at least one of its rows has a given value, without limiting the result to only the rows that have that value?

Customer_ID |   Item
   ab123       Cookie
   ab123       Juice
   ab123       Towel
   bd345       Juice 
   bd345       Coffee
   VG555       Cookie
   VG555       Cookie
   VG555       Water 

I want all customer IDs that have at least one item that is a cookie.

CodePudding user response:

An EXISTS is sometimes useful for this.

SELECT Customer_ID
FROM customers cust
WHERE EXISTS (
  SELECT 1
  FROM purchases purch
  WHERE purch.Customer_ID = cust.Customer_ID
    AND purch.Item = 'Cookie' 
);

But since this is only about customers with cookies, a DISTINCT or GROUP BY could be enough.

SELECT Customer_ID, COUNT(Item) AS Cookies
FROM purchases
WHERE Item = 'Cookie'
GROUP BY Customer_ID;
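
The question also asks not to limit the result to the cookie rows. The same EXISTS idea can be turned on the purchases table itself to keep every row of each qualifying customer. This is only a minimal sketch: it assumes all the data lives in a single table named purchases with the columns Customer_ID and Item, and the aliases p and c are made up for illustration (the question does not name its table).

```sql
-- Assumption: one table purchases(Customer_ID, Item) holds the sample data.
-- Return every row of each customer that has at least one 'Cookie' row.
SELECT p.Customer_ID, p.Item
FROM purchases p
WHERE EXISTS (
  SELECT 1
  FROM purchases c                     -- same table, second alias
  WHERE c.Customer_ID = p.Customer_ID  -- correlate on the customer
    AND c.Item = 'Cookie'              -- at least one cookie row must exist
);
```

With the sample data from the question this should keep all rows for ab123 and VG555 and drop both rows for bd345.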

CodePudding user response:

You want all customer IDs that have at least one item that is a cookie. For this you'd simply select the customer IDs from the rows where the item equals 'Cookie'. In order to get rid of duplicates, you'd use DISTINCT:

select distinct customer_id from mytable where item = 'Cookie';

With a big table and many duplicates, this can take a long time, though. Imagine 1000 customers with 1000 cookies each. That would be a million rows to sort and aggregate in order to get the 1000 result rows. In that case a mere lookup might be more appropriate:

select customer_id
from customer
where customer_id in (select customer_id from mytable where item = 'Cookie');

Thus the DBMS can stop reading a customer's rows once it has found the first cookie for them.

In a DBMS that supports indexes, both queries would benefit from this index:
create index idx on mytable (item, customer_id);

Update: Snowflake doesn't support indexes, so try both queries and see which one runs faster.

CodePudding user response:

Using QUALIFY and COUNT_IF, which returns every row of each customer that has at least one cookie:

SELECT *
FROM Tab
QUALIFY COUNT_IF(Item='Cookie') OVER(PARTITION BY Customer_ID) > 0;

Alternatively, using HAVING to return only the customer IDs:

SELECT Customer_id
FROM tab
GROUP BY Customer_Id
HAVING COUNT_IF(Item='Cookie') > 0;
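
COUNT_IF and QUALIFY are not available in every DBMS. As a rough sketch of a more portable equivalent, the same two results can be obtained with a CASE expression inside SUM. It assumes the same table tab as above; the alias t and the column name cookie_count are made up for illustration.

```sql
-- Customer IDs with at least one cookie, using conditional aggregation
-- instead of COUNT_IF.
SELECT Customer_ID
FROM tab
GROUP BY Customer_ID
HAVING SUM(CASE WHEN Item = 'Cookie' THEN 1 ELSE 0 END) > 0;

-- All rows of those customers: compute the per-customer cookie count with a
-- window function in a derived table, then filter in the outer query
-- (this is the part QUALIFY would otherwise do).
SELECT Customer_ID, Item
FROM (
  SELECT Customer_ID,
         Item,
         SUM(CASE WHEN Item = 'Cookie' THEN 1 ELSE 0 END)
           OVER (PARTITION BY Customer_ID) AS cookie_count
  FROM tab
) t
WHERE cookie_count > 0;
```

The derived table is needed in the second query because a window function cannot be referenced directly in a WHERE clause.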