I have a table with data like following, want to return those group_id with unique data. Both group_id 3 and 4 have two component 123 and 456, so they are "duplicated", we just need to return the smaller group_id, that's 3. Also group_id 5 doesn't have a duplication, it can be returned. So we want group_id 3 and 5 to be returned.
How can I write a SQL query against postgres database to achieve that? Thank you!
id | group_id | component_id |
---|---|---|
1 | 3 | 123 |
2 | 3 | 456 |
3 | 4 | 123 |
4 | 4 | 456 |
5 | 5 | 123 |
CodePudding user response:
SELECT group_id, MIN(component_id)
FROM MyTable
GROUP BY group_id
HAVING COUNT(*) > 1
CodePudding user response:
How about counting how many there are per group_id.
Then take the first, with a priority for those with less components per group_id.
WITH CTE AS (
SELECT group_id, component_id
, COUNT(*) OVER (PARTITION BY group_id) group_count
FROM YourTable
)
SELECT component_id, group_id
FROM
(
SELECT component_id, group_id
, ROW_NUMBER() OVER (PARTITION BY component_id ORDER BY group_count ASC, group_id ASC) AS component_rn
FROM CTE
) q
WHERE component_rn = 1
component_id | group_id |
---|---|
123 | 5 |
456 | 3 |
db<>fiddle here