I'm currently trying to run the below query.
SELECT DISTINCT friend_id, hobbies as friend_hobby
FROM all_friends
WHERE friend_id NOT IN
(SELECT distinct(friend_id)
FROM all_friends
WHERE (friend_type = 'Good' OR friend_type = 'Great')
);
Expected Result
friend_id | friend_hobby
--------------------
123 | Basketball
--------------------
345 | Painting
Actual Result
friend_id | friend_hobby
--------------------
123 | Basketball
--------------------
123 | Social Media
--------------------
123 | Reading
--------------------
123 | Hiking
--------------------
123 | Cooking
--------------------
345 | Painting
--------------------
345 | Hiking
--------------------
345 | Writing
--------------------
345 | Movies
--------------------
345 | Coding
Can anyone explain why the distinct clause doesn't return my Expected Result?
Is it because I'm also fetching friend_hobby, as well?
For note, I need to select both friend_id and hobbies, because I need to UNION with a Select statement fetching friend_id and hobbies. The hobbies in this select statement doesn't really matter, because they'll be replaced at some point.
What is the solution? I know adding group by at the end of the statement will remove the duplicate friend ids, but I'd prefer not to use group by without an aggregation occurring.
Thanks!
CodePudding user response:
The issue is that DISTINCT is based on the "group" that you define
SELECT DISTINCT friend_id, hobbies as friend_hobby
In the line above, you're saying to give you all distinct pairs of friend_id
and hobbies
(friend_hobby).
However, your expected results seems to be more like: fetch me the first tuple that has a distinct id (if that sentence makes sense at all).
In this case, I believe doing something like the following post should help with what you want: https://stackoverflow.com/a/42345019/5989233
CodePudding user response:
DISTINCT itself doesnt remove duplicated, you need group by
as well
SELECT DISTINCT friend_id, friend_hobby from {whatever} group by friend_id