I have this SQL query to get related comments for each ID in a list:
SELECT comments.id, comments.body, comments.created_at
, comments.like_count, comments.post_id
FROM comments
WHERE comments.is_active AND comments.is_show
AND comments.post_id in (1, 7, 9, 11, 3)
GROUP BY comments.id
ORDER BY comments.id
LIMIT 3 <----- this is not working correctly!
I want to get the top 3 comments for each post id in the given list (1, 7, 9, 11, 3).
How to achieve this?
CodePudding user response:
This will nuke the performance of solutions suggested so far:
SELECT c.*
FROM unnest('{1, 7, 9, 11, 3}'::int[]) AS p(post_id) -- assuming integer?
CROSS JOIN LATERAL (
SELECT id, body, created_at, like_count, post_id
FROM comments c
WHERE is_active
AND is_show
AND c.post_id = p.post_id
ORDER BY like_count DESC NULLS LAST -- assuming this defines "top"?
LIMIT 3
) c
ORDER BY id, like_count DESC NULLS LAST; -- see below
Needs an index on (post_id, like_count)
to be fast.
Unlike slow solutions with row_number()
, which have to scan the whole comments table, this can identify the top 3 comments per post_id
from the index cheaply. For big tables, this is faster by orders of magnitude.
See:
- Limit number of rows per group from join (NOT to 1 row)
- Optimize GROUP BY query to retrieve latest row per user
If like_count
is defined NOT NULL
(as it probably should), you can simplify to ORDER BY id, like_count DESC
. Else you need DESC NULLS LAST
, and the optimal index is on (post_id, like_count DESC NULLS LAST)
. About that:
- Sort by column ASC, but NULL values first?
- Performance impact of view on aggregate function vs result set limiting
CodePudding user response:
One option would be to use row_number. Something like:
SELECT c.id, c.body, c.created_at, c.like_count, c.post_id
FROM (
SELECT comments.*, row_number() OVER (PARTITION BY post_id ORDER BY id) rn
FROM comments
WHERE comments.is_active AND comments.is_show
AND comments.post_id in (1, 7, 9, 11, 3)
) c
WHERE rn <= 3
CodePudding user response:
You can use a ROW_NUMBER to get 3 comments for every postid
WITH CTE AS (
SELECT comments.id, comments.body, comments.created_at
, comments.like_count, comments.post_id
,ROW_NUMBER() OVER( PARTITION BY post_id ORDER BY id) rn
FROM comments
WHERE comments.is_active AND comments.is_show
AND comments.post_id in (1, 7, 9, 11, 3)
)
SELECT id, body, created_at, like_count, post_id
FROM CTE WHERE rn < 4
ORDER BY id