Seemingly similar to How to get the top 10 values in postgresql?, yet somehow very different.
We'll set it up similar to that question:
I have a postgresql table: Scores(score integer)
.
How would I get the highest 99% of the scores? We cannot say that we know beforehand how many rows there are, so we can't use the same limit to an integer trick. SQL Server has an easy SELECT TOP
syntax -- is there anything similarly simple in the postgresql world?
CodePudding user response:
This should be doable with percent_rank()
select score
from (
select score, percent_rank() over (order by score desc) as pct_rank
from scores
) t
where pct_rank <= 0.99
CodePudding user response:
you can use the ntile function to partition the rows into percentiles and then select the rows where tile > 99
example:
-- following query generates 1000 rows with random
-- scores and selects the 99th percentile using the ntile function.
-- because the chance of the same random value appearing twice is extremely
-- small, the result should in most cases yield 10 rows.
with scores as (
select
id
, random() score
from generate_series(1, 1000) id
)
, percentiles AS (
select
*
, ntile(100) over (order by score) tile
from scores
)
select
id
, score
from percentiles
where tile > 99