here's an example of what I'm looking for:
user_id | color |
---|---|
1 | red |
1 | yellow |
1 | blue |
2 | red |
3 | red |
4 | red |
I want to pull users who DO NOT have the color red or yellow AT ALL. Which I know is not simply:
select user_id
from table_name
where color not in ('red', 'yellow')
User 1 will still output even though they have a line item for both red and yellow. The single line item of color = blue for user 1 fits the criteria of color not in ('red', 'yellow')
, but I want to exclude users that have any line item of red or yellow.
I think something like
with CTE
(of all users that have bought red or yellow)
select user_id
from table_name
where user_id not in (select * from CTE)
will work... but is that best practice? I'm wondering if there's some way easier solution or function out there
CodePudding user response:
With only this one table, we'd use aggregation. E.g.:
select user_id
from table_name
group by user_id
having max(case when color = 'red' then 1 else 0 end) = 0
and max(case when color = 'yellow' then 1 else 0 end) = 0);
With a separate user table (which I suppose exists, as there is a user_id in the table) we would typically use a lookup with NOT EXISTS
or NOT IN
instead. E.g.:
select user_id
from users
where user_id not in (select user_id from table_name where color = 'red')
and user_id not in (select user_id from table_name where color = 'yellow');
CodePudding user response:
This should work for your expected result set:
select distinct user_id, colour
from table_name
where user_id not in (select user_id
from table_name
where colour in ('red', 'yellow'))
You can check an example solution here: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=775bfc1511fede58b469e4da6b1779d1
CodePudding user response:
Following your idea to use IN
, you could do following:
SELECT user_id
FROM table_name
WHERE user_id NOT IN (SELECT user_id FROM table_name
WHERE color IN ('red', 'yellow'));
You could also use NOT EXISTS
which could maybe be faster, but it's less readable. And you could also add a DISTINCT
if you want to make sure to do not select identic user id's multiple times. The query with DISTINCT
would be slower, so it should only be used if really necessary.
CodePudding user response:
I suggest NOT EXISTS it have better performance that NOT IN
SELECT t1.user_id
FROM table_name t1
WHERE not exists (SELECT user_id FROM table_name t2 WHERE color IN ('red', 'yellow') and t1.user_id=t2.user_id);