How to aggregate to group level, order and enumerate groups chronologically? (in SQL Server)-CodePudding

Within a hospital encounter, a patient may be administered several different formulations of a medication, as shown here:

Encounter	Medication	Administration	Adm_Num
1	A	8/31/21 11:33 AM	1
1	B	8/31/21 6:25 PM	2
1	C	9/1/21 8:55 AM	3
1	D	9/1/21 10:00 PM	4
1	B	9/2/21 11:27 AM	5
1	B	9/2/21 10:00 PM	6
1	B	9/3/21 6:15 AM	7
1	B	9/3/21 3:30 PM	8
1	D	9/3/21 8:30 PM	9

The task: For each encounter, I need to enumerate each formulation of the medication, like this:

Encounter	Medication	Administration	Adm_Num	Formulation
1	A	8/31/21 11:33 AM	1	1
1	B	8/31/21 6:25 PM	2	2
1	C	9/1/21 8:55 AM	3	3
1	D	9/1/21 10:00 PM	4	4
1	B	9/2/21 11:27 AM	5	5
1	B	9/2/21 10:00 PM	6	5
1	B	9/3/21 6:15 AM	7	5
1	B	9/3/21 3:30 PM	8	5
1	D	9/3/21 8:30 PM	9	6

I need to preserve the administration-level granularity in order to report on those data (and I'm in good shape there).

One strategy I tried was a CTE to find a medication-level value for each medication, such as the first administration instant, then apply that to the more granular administration-level data, like this:

WITH f as (
SELECT a.Encounter, a.Medication, MIN(Administration) as First_Adm,
 DENSE_RANK() OVER (PARTITION Encounter, Medication ORDER BY Administration) as Formulation
FROM a
GROUP BY a.Encounter, a.Medication
)
SELECT a.Encounter, a.Medication, a.Administration, a.Adm_Num, f.[First_Adm], f.Formulation
FROM a
INNER JOIN f ON (a.Encounter = f.Encounter AND a.Medication = f.Medication)
GROUP BY a.Encounter, a.Medication, a.Administration
ORDER BY a.Encounter, a.Medication, a.Administration

That worked well for the first few medications with single administrations, but it mishandled the reappearance of Medications B and D later in the encounter; it recognized Medication B from before, gave it the earlier MIN(Administration), and mislabeled it Formulation 2, as shown here:

Encounter	Medication	Administration	Adm_Num	First_Adm	Formulation
1	A	8/31/21 11:33 AM	1	8/31/21 11:33 AM	1
1	B	8/31/21 6:25 PM	2	8/31/21 6:25 PM	2
1	C	9/1/21 8:55 AM	3	9/1/21 8:55 AM	3
1	D	9/1/21 10:00 PM	4	9/1/21 10:00 PM	4
1	B	9/2/21 11:27 AM	5	8/31/21 6:25 PM	2
1	B	9/2/21 10:00 PM	6	8/31/21 6:25 PM	2
1	B	9/3/21 6:15 AM	7	8/31/21 6:25 PM	2
1	B	9/3/21 3:30 PM	8	8/31/21 6:25 PM	2
1	D	9/3/21 8:30 PM	9	9/1/21 10:00 PM	4

This is close, but not quite there. The remaining challenge is to account for repeated formulations.

How can I group the administration-level data by medication formulation groups, order the formulations chronologically (making use of the time data), and assign a sequential value to each formulation, even if some are repeated?

Thanks in advance for your help.

CodePudding user response：

This is a type of "gaps and islands" problem.

One method is to first use lag or lead to check the adjacent row to identify where the data (Medication) changes, followed by summing these values for all preceding rows for each row in the results to generate the desired sequence:

with c as (
    select *, 
      case when 
        Lag(medication) over (partition by encounter order by Administration) = medication 
      then 0 else 1 end Changed
    from t
)
select Encounter, Medication, Administration, Adm_Num,
  Sum(changed) over(partition by Encounter order by Administration) Formulation
from c;