I want to assign the rank to a particular column as per the order of other column for each distinct-CodePudding

I want to rank 'drug_name' as per the order of 'svcdate' for each 'patient_id'.

I have attached sample desired output in the image,

To do so I've tried using the following query,

select *,
dense_rank() over(partition by PATIENT_ID,drug_name order by PATIENT_ID) as rnk 
from table
order by PATIENT_ID, svcdate;

Although it's not giving me the output which is mentioned in the image. Please help me to get the desired output. Thanks!!

CodePudding user response：

I think you are really close already. You want your rank to only reset to 1 when encountering a new patient_id, so only include that in your partition by. Then order your ranking by svcdate, to get them in date order, and lastly by drug name. This will rank the same drug filled multiple times on the same date as equals.

select *,
dense_rank() over(partition by PATIENT_ID order by svcdate, drug_name) as rnk 
from table
order by PATIENT_ID, svcdate;

CodePudding user response：

This answer is using Snowflake syntax, it might run on other system.

One way to read:

I want to rank 'drug_name' as per the order of 'svcdate' for each 'patient_id'.

implies:

,dense_rank() over(partition by patient_id order by svcdate) as rank

but that gives:

PATIENT_ID	SVCDATE	DRUG_NAME	RANK
110	2020-04-08	RUCONEST	1
110	2020-04-29	FIRAZYR	2
110	2020-05-27	FIRAZYR	3
110	2020-06-11	TAKHZYRO	4
120	2019-08-04	TAKHZYRO	1
120	2019-08-28	KALBITOR	2
120	2019-08-28	KALBITOR	2
120	2020-06-30	TAKHZYRO	3
120	2020-08-04	KALBITOR	4

but if we rank the drug name by their first/earliest svcdate for each patient_id:

we can use:

select patient_id
    ,svcdate
    ,drug_name
    ,dense_rank() over(partition by patient_id order by first_date) as rank
from (
    select *
        ,first_value(svcdate) over (
           partition by patient_id, drug_name 
           order by svcdate) as first_date
    from data
)
order by 1,2;

which gives:

PATIENT_ID	SVCDATE	DRUG_NAME	RANK
110	2020-04-08	RUCONEST	1
110	2020-04-29	FIRAZYR	2
110	2020-05-27	FIRAZYR	2
110	2020-06-11	TAKHZYRO	3
120	2019-08-04	TAKHZYRO	1
120	2019-08-28	KALBITOR	2
120	2019-08-28	KALBITOR	2
120	2020-06-30	TAKHZYRO	1
120	2020-08-04	KALBITOR	2

These results match for patient 110, but for 120, you have the same two drugs split almost over years:

so if we use:

select patient_id
    ,svcdate
    ,drug_name
    ,dense_rank() over(partition by patient_id order by first_date) as rank
from (
    select *
        ,first_value(svcdate) over 
            (partition by patient_id, drug_name, year(svcdate) 
             order by svcdate) as first_date
    from data
)
order by 1,2;

PATIENT_ID	SVCDATE	DRUG_NAME	RANK
110	2020-04-08	RUCONEST	1
110	2020-04-29	FIRAZYR	2
110	2020-05-27	FIRAZYR	2
110	2020-06-11	TAKHZYRO	3
120	2019-08-04	TAKHZYRO	1
120	2019-08-28	KALBITOR	2
120	2019-08-28	KALBITOR	2
120	2020-06-30	TAKHZYRO	3
120	2020-08-04	KALBITOR	4

which now matches both patients.