This is the table
id | category | value |
---|---|---|
1 | A | 40 |
1 | B | 20 |
1 | C | 10 |
2 | A | 4 |
2 | B | 7 |
2 | C | 7 |
3 | A | 32 |
3 | B | 21 |
3 | C | 2 |
I want the result like this
id | category |
---|---|
1 | A |
2 | B |
2 | C |
3 | A |
CodePudding user response:
For small tables or for only very few rows per user, a subquery with the window function rank()
(as demonstrated by The Impaler) is just fine. The resulting sequential scan over the whole table, followed by a sort will be the most efficient query plan.
For more than a few rows per user, this gets increasingly inefficient though.
Typically, you also have a users
table holding one distinct row per user. We can leverage that for an alternative query that scales much better - using WITH TIES
in a LATERAL JOIN
.
Requires Postgres 13 or later.
SELECT u.id, t.*
FROM users u
CROSS JOIN LATERAL (
SELECT t.category
FROM tbl t
WHERE t.id = u.id
ORDER BY t.value DESC
FETCH FIRST 1 ROWS WITH TIES -- !
) t;
See:
- Greater than or equal to ALL() and equal to MAX() speed
- Fetching a minimum of N rows, plus all peers of the last row
This can use a multicolumn index to great effect - which must exist, of course:
CREATE INDEX ON tbl (id, value);
Or:
CREATE INDEX ON tbl (id, value DESC);
Even faster index-only scans become possible with:
CREATE INDEX ON tbl (id, value DESC, category);
Or (the optimum for the query at hand):
CREATE INDEX ON tbl (id, value DESC) INCLUDE (category);
Assuming value
is defined NOT NULL
, or we have to use DESC NULLS LAST
. See:
To keep users in the result that don't have any rows in table tbl
, user LEFT JOIN LATERAL (...) ON true
. See:
CodePudding user response:
You can use RANK()
to identify the rows you want. Then, filtering is easy. For example:
select *
from (
select *,
rank() over(partition by id order by value desc) as rk
from t
) x
where rk = 1
Result:
id category value rk
--- --------- ------ --
1 A 40 1
2 B 7 1
2 C 7 1
3 A 32 1
See running example at DB Fiddle.