Let's say I have two tables called schools
and students
which have a one-to-many relationship. Each student can only be in one school but each school can have many students.
school
-----------
id
name
student
-----------
id
name
school_id
I need to find the 90th percentile of the number of students each school has.
I can already sort the schools based on the student count, but I don't know how to get the percentile.
select school_id, count(id) as count from students
group by school_id
order by count desc
CodePudding user response:
maybe this helps:
select school_id,stud_count, PERCENT_RANK() OVER(partition by school_id ORDER BY stud_count)
from(
select school_id, count(*) as stud_count
from students s
LEFT JOIN school sc ON s.school_id = sc.id
group by school_id
order by count desc)x
CodePudding user response:
I believe you need the percentile_disc
function? Here's an example:
with
-- The following hard-coded data set simulates:
--
-- select count(*) from students group by school_id
--
students_per_school(student_count) as (values
(1), -- 10% = 0.1
(1), -- 20% = 0.2
(1), -- 30% = 0.3
(1), -- 40% = 0.4
(2), -- 50% = 0.5
(3), -- 60% = 0.6
(4), -- 70% = 0.7
(5), -- 80% = 0.8
(6), -- 90% = 0.9
(7) -- 100% = 1.0
)
select
percentile_disc(0.1) within group (order by student_count) as " 10th percentile",
percentile_disc(0.2) within group (order by student_count) as " 20th percentile",
percentile_disc(0.3) within group (order by student_count) as " 30th percentile",
percentile_disc(0.4) within group (order by student_count) as " 40th percentile",
percentile_disc(0.5) within group (order by student_count) as " 50th percentile",
percentile_disc(0.6) within group (order by student_count) as " 60th percentile",
percentile_disc(0.7) within group (order by student_count) as " 70th percentile",
percentile_disc(0.8) within group (order by student_count) as " 80th percentile",
percentile_disc(0.9) within group (order by student_count) as " 90th percentile",
percentile_disc(1.0) within group (order by student_count) as "100th percentile"
from students_per_school;
The above will output (using extended output: \x on
in psql):
┌─[ RECORD 1 ]─────┬───┐
│ 10th percentile │ 1 │
│ 20th percentile │ 1 │
│ 30th percentile │ 1 │
│ 40th percentile │ 1 │
│ 50th percentile │ 2 │
│ 60th percentile │ 3 │
│ 70th percentile │ 4 │
│ 80th percentile │ 5 │
│ 90th percentile │ 6 │
│ 100th percentile │ 7 │
└──────────────────┴───┘
I've also created a fiddle which queries an actual table: https://dbfiddle.uk/vNl-rHdn