I have the following data:
id1 | id2 | score |
---|---|---|
281 | 33453 | 23 |
281 | 33452 | 23 |
281 | 36466 | 24 |
282 | 12118 | 14 |
282 | 27603 | 18 |
How do I write my query in PostgreSQL in the most efficient way so that I keep all id2 values belonging to id1 where the score is the lowest (for every id1)? Or in other words, I want to have following result:
id1 | id2 | min_score |
---|---|---|
281 | 33453 | 23 |
281 | 33452 | 23 |
282 | 12118 | 14 |
I'm using following query so far:
SELECT m1.id1, m1.id2, m2.min_score
FROM test m1
INNER JOIN (
SELECT id1, MIN(score) as min_score
FROM test
GROUP BY id1
) m2
ON (m1.id1 = m2.id1 AND m1.score = m2.min_score)
ORDER BY m1.id1, m2.min_score
Is there a faster way?
Thanks in advance!
CodePudding user response:
After reading the question again I will edit it to the alt method to do it. Depending on how your database runs it operations this might be slightly faster or slower.
SELECT distinct a.id1, a.id2, a.minscore
FROM test a,
(SELECT id1, min(score) as minscore
FROM test
group by id1
) AS b
WHERE a.id= b.id
might return results faster edit log: fixed typo
CodePudding user response:
You can write this with a window function rather than a join:
select id1, id2, score as min_score from (
select *, rank() over (partition by id1 order by score) as rank
from test
) foo
where rank=1
ORDER BY id1, min_score;
As for which method will be faster, try it and see. If you want help interpreting the results, show an EXPLAIN (ANALYZE, BUFFERS) of it done each way. You should probably choose between them based on how easy it is for you to understand them, not based on speed. Some people intuitively grasp window functions, and some people don't.