One of the test questions came by with following schemas, to look for the best doctor in terms of:
- Best scored;
- The most times/attempts;
- For each medical procedures (in terms of name)
[doctor] table
id | first_name | last_name | age |
---|---|---|---|
1 | Phillip | Singleton | 50 |
2 | Heidi | Elliott | 34 |
3 | Beulah | Townsend | 35 |
4 | Gary | Pena | 36 |
5 | Doug | Lowe | 45 |
[medical_procedure] table
id | doctor_id | name | score |
---|---|---|---|
1 | 3 | colonoscopy | 44 |
2 | 1 | colonoscopy | 37 |
3 | 4 | ulcer surgery | 98 |
4 | 2 | angiography | 79 |
5 | 3 | angiography | 84 |
6 | 3 | embolization | 87 |
and list goes on...
Given solution as follow:
WITH cte AS(
SELECT
name,
first_name,
last_name,
COUNT(*) AS procedure_count,
RANK() OVER(
PARTITION BY name
ORDER BY COUNT(*) DESC) AS place
FROM
medical_procedure p JOIN doctor d
ON p.doctor_id = d.id
WHERE
score >= (
SELECT AVG(score)
FROM medical_procedure pp
WHERE pp.name = p.name)
GROUP BY
name,
first_name,
last_name
)
SELECT
name,
first_name,
last_name
FROM cte
WHERE place = 1;
It'll mean a lot to be clarified on/explain on how the WHERE clause worked out under the subquery:
- How it worked out in general
- Why must we match the two pp.name and p.name for it to reflect the correct rows...
...
WHERE
score >= (
SELECT AVG(score)
FROM medical_procedure pp
WHERE pp.name = p.name)
...
Thanks a heap!
CodePudding user response:
Above is join with doctor and medical procedure and group by procedure name and you need doctor names with most attempt and best scored. Subquery will join by procedure avg score and those who have better score than avg will be filtered.
Now there can be multiple doctor better than avg so taken rank by procedure count so most attempted will come first and then you taken first to pick top one
CodePudding user response:
clarified on/explain on how the WHERE clause worked out under the subquery:
Your subquery says "where the doctor's score is above average for that procedure", and that's not part of the question. The subquery is unnecessary.
Your main query's RANK()
function is also incorrect. You're ranking by number of procedures done (ORDER BY COUNT(*) DESC
), but you really should rank by the doctor's average score in each procedure.
I would approach the issue like so:
select
name,
doctor_id,
avg(score) avg_score,
count(*) procedure_count,
rank() over (partition by name order by avg(score) desc) place
from
medical_procedure
group by
name,
doctor_id
This gives us a single row for each procedure and doctor, ranked by how well doctors do on average for each procedure.
Now it's easy to produce the output you want
with
procedure_ranked as (
-- the above query
)
select
p.name,
d.first_name,
d.last_name,
p.procedure_count
from
procedure_ranked p
inner join doctor d on d.id = p.doctor_id
where
p.place = 1;