Home > front end >  Getting top 3 rows for each ID in a list
Getting top 3 rows for each ID in a list

Time:08-06

I have this SQL query to get related comments for each ID in a list:

SELECT comments.id, comments.body, comments.created_at
     , comments.like_count, comments.post_id
FROM   comments
WHERE  comments.is_active AND comments.is_show 
AND    comments.post_id in (1, 7, 9, 11, 3)
GROUP BY comments.id
ORDER BY comments.id
LIMIT 3   <----- this is not working correctly!

I want to get the top 3 comments for each post id in the given list (1, 7, 9, 11, 3).
How to achieve this?

CodePudding user response:

This will nuke the performance of solutions suggested so far:

SELECT c.*
FROM   unnest('{1, 7, 9, 11, 3}'::int[]) AS p(post_id)  -- assuming integer?
CROSS  JOIN LATERAL (
   SELECT id, body, created_at, like_count, post_id
   FROM   comments c
   WHERE  is_active
   AND    is_show 
   AND    c.post_id = p.post_id
   ORDER  BY like_count DESC NULLS LAST  -- assuming this defines "top"?
   LIMIT  3
   ) c
ORDER  BY id, like_count DESC NULLS LAST;  -- see below

Needs an index on (post_id, like_count) to be fast.

Unlike slow solutions with row_number(), which have to scan the whole comments table, this can identify the top 3 comments per post_id from the index cheaply. For big tables, this is faster by orders of magnitude.

See:

If like_count is defined NOT NULL (as it probably should), you can simplify to ORDER BY id, like_count DESC. Else you need DESC NULLS LAST, and the optimal index is on (post_id, like_count DESC NULLS LAST). About that:

CodePudding user response:

One option would be to use row_number. Something like:

SELECT c.id, c.body, c.created_at, c.like_count, c.post_id
FROM (
 SELECT comments.*, row_number() OVER (PARTITION BY post_id ORDER BY id) rn
 FROM comments
 WHERE comments.is_active AND comments.is_show 
  AND  comments.post_id in (1, 7, 9, 11, 3)
) c
WHERE rn <= 3

CodePudding user response:

You can use a ROW_NUMBER to get 3 comments for every postid

WITH CTE AS (
SELECT comments.id, comments.body, comments.created_at
       , comments.like_count, comments.post_id
       ,ROW_NUMBER() OVER( PARTITION BY post_id  ORDER BY id) rn
 FROM comments
 WHERE comments.is_active AND comments.is_show 
       AND  comments.post_id in (1, 7, 9, 11, 3)
)
SELECT id, body, created_at, like_count, post_id
FROM CTE WHERE rn < 4
ORDER BY id
  • Related