Home > Net >  Return the provider with the most visits out of the past 7 visits for each patient each month
Return the provider with the most visits out of the past 7 visits for each patient each month

Time:12-17

Currently have a dataset of 'visits' with the day, patient, provider, the rank of the appointment (sequential number of that appointment for that patient) and rank of the appointment/provider (sequential number of that appointment for that patient/provider)

Need to use this to do a lookback on the first of each month (1/1/2020 start), for each patient look at the prior 7 visits (with no limit on how far this is) and grab the provider that had the majority of those visits to associate with that patient for that month.

If the patient only had less than 7 visits, still apply this logic. If the patient has an even number of visits that leads to a tie of these providers, the winner will be the most recent one. Patients may not have any appointments for that month period.

So for example patient 'A' the base data is like this...

ApptDate | Patient | Provider | ApptRank | ApptProvRank |
1/15/2020|    A    |    BOB   |    1     |      1       |
3/01/2020|    A    |    BOB   |    2     |      2       |
3/08/2020|    A    |    BOB   |    3     |      3       |
3/20/2020|    A    |    BOB   |    4     |      4       |
4/01/2020|    A    |    BOB   |    5     |      5       |
4/15/2020|    A    |    SMITH |    6     |      1       |
6/07/2020|    A    |    SMITH |    7     |      2       |
6/21/2020|    A    |    BOB   |    8     |      6       |
7/01/2020|    A    |    JANE  |    9     |      1       |
7/15/2020|    A    |    JANE  |    10    |      2       |
7/21/2020|    A    |    JANE  |    11    |      3       |
8/01/2020|    A    |    JANE  |    12    |      4       |
8/15/2020|    A    |    JANE  |    13    |      5       |
9/20/2020|    A    |    JANE  |    14    |      6       |

I then have a process that fills in missing months and counts the number of appts within that month for each provider and that looks like this...

Month    | Patient | Provider |   Appts    |Cumulative Appt Rank | Cumulative Appt Provider Rank
1/1/2020 |     A   |    BOB   |     1      |   1                 |   1
2/1/2020 |     A   |    NULL  |    NULL    |  NULL               |   NULL
3/1/2020 |     A   |    BOB   |     3      |   4                 |   4
4/1/2020 |     A   |    BOB   |     1      |   5                 |   5
4/1/2020 |     A   |    SMITH |     1      |   6                 |   1
5/1/2020 |     A   |    NULL  |    NULL    |  NULL               |   NULL
6/1/2020 |     A   |    SMITH |     1      |   7                 |   2
6/1/2020 |     A   |    BOB   |     1      |   8                 |   6
7/1/2020 |     A   |    JANE  |     3      |   11                |   3
8/1/2020 |     A   |    JANE  |     2      |   13                |   5
9/1/2020 |     A   |    JANE  |     2      |   14                |   6

What I want is something that looks like this: (ONE month ONE provider and the share of Appts out of the 7 prior - which should never be higher than 7)

Month    | Patient | Provider | Appt Count (of Past 7)
2/1/2020 |     A   |    BOB   |     1      |            ----\> starting 2/1/2020 for the 'lookback'
3/1/2020 |     A   |    BOB   |     1      |
4/1/2020 |     A   |    BOB   |     4      |
5/1/2020 |     A   |    BOB   |     5      |  
6/1/2020 |     A   |    BOB   |     5      |  
7/1/2020 |     A   |    BOB   |     5      |  
8/1/2020 |     A   |    JANE  |     3      |  
9/1/2020 |     A   |    JANE  |     5      |  
10/1/2020|     A   |    JANE  |     6      |

This would involve multiple patients, but to simplify for this example it's just one. Greatly appreciate any help on this one

CodePudding user response:

After converting your example data into DDL/DML:

DECLARE @Visits TABLE (ID INT IDENTITY, ApptDate DATETIME, Patient NVARCHAR(10), Provider NVARCHAR(10), ApptRank INT, ApptProvRank INT);
INSERT INTO @Visits (ApptDate, Patient, Provider, ApptRank, ApptProvRank) VALUES
('1/15/2020',    'A',    'BOB   ',    1 ,      1),
('3/01/2020',    'A',    'BOB   ',    2 ,      2),
('3/08/2020',    'A',    'BOB   ',    3 ,      3),
('3/20/2020',    'A',    'BOB   ',    4 ,      4),
('4/01/2020',    'A',    'BOB   ',    5 ,      5),
('4/15/2020',    'A',    'SMITH ',    6 ,      1),
('6/07/2020',    'A',    'SMITH ',    7 ,      2),
('6/21/2020',    'A',    'BOB   ',    8 ,      6),
('7/01/2020',    'A',    'JANE  ',    9 ,      1),
('7/15/2020',    'A',    'JANE  ',    10,      2),
('7/21/2020',    'A',    'JANE  ',    11,      3),
('8/01/2020',    'A',    'JANE  ',    12,      4),
('8/15/2020',    'A',    'JANE  ',    13,      5),
('9/20/2020',    'A',    'JANE  ',    14,      6),

('1/15/2021',    'B',    'BOB   ',    1 ,      1),
('3/01/2021',    'B',    'BOB   ',    2 ,      2),
('3/08/2021',    'B',    'BOB   ',    3 ,      3),
('3/20/2021',    'B',    'BOB   ',    4 ,      4),
('4/01/2021',    'B',    'BOB   ',    5 ,      5),
('4/15/2021',    'B',    'SMITH ',    6 ,      1),
('6/07/2021',    'B',    'SMITH ',    7 ,      2),
('6/21/2021',    'B',    'JANE  ',    8 ,      6),
('7/01/2021',    'B',    'JANE  ',    9 ,      1),
('7/15/2021',    'B',    'JANE  ',    10,      2),
('7/21/2021',    'B',    'JANE  ',    11,      3),
('8/01/2021',    'B',    'JANE  ',    12,      4),
('8/15/2021',    'B',    'JANE  ',    13,      5),
('9/20/2021',    'B',    'JANE  ',    14,      6);

We can look at this data via windowed functions to determine the most popular provider for a patient over the last 7 visits (regardless of duration) and then join back to it to see the most popular provider for the 7 visit window for each appointment.

;WITH top7VisitProvider AS (
SELECT patient, a.ApptRank, a.v7Provider, a.ProviderLast7Cnt, ROW_NUMBER() OVER (PARTITION BY a.Patient, a.ApptRank ORDER BY a.ProviderLast7Cnt DESC) AS rn
  FROM (
        SELECT v.ID, v.ApptDate, v.Patient, v.Provider, v.ApptRank, v.ApptProvRank, vr.Provider AS v7Provider, COUNT(*) AS ProviderLast7Cnt
          FROM @Visits v
            LEFT OUTER JOIN @Visits vr
              ON v.Patient = vr.Patient
              AND vr.ApptRank BETWEEN v.ApptRank-7 AND v.ApptRank
        GROUP BY v.ID, v.ApptDate, v.Patient, v.Provider, v.ApptRank, v.ApptProvRank, vr.Provider
       ) a
)

SELECT v.ID, v.ApptDate, v.Patient, v.Provider, v.ApptRank, v.ApptProvRank, t7vp.v7Provider
  FROM @Visits v
    INNER JOIN top7VisitProvider t7vp
      ON v.Patient = t7vp.Patient
      AND v.ApptRank = t7vp.ApptRank
      AND t7vp.rn = 1
ID  ApptDate                Patient Provider    ApptRank    ApptProvRank    v7Provider
--------------------------------------------------------------------------------------
1   2020-01-15 00:00:00.000 A       BOB         1           1               BOB   
2   2020-03-01 00:00:00.000 A       BOB         2           2               BOB   
3   2020-03-08 00:00:00.000 A       BOB         3           3               BOB   
4   2020-03-20 00:00:00.000 A       BOB         4           4               BOB   
5   2020-04-01 00:00:00.000 A       BOB         5           5               BOB   
6   2020-04-15 00:00:00.000 A       SMITH       6           1               BOB   
7   2020-06-07 00:00:00.000 A       SMITH       7           2               BOB   
8   2020-06-21 00:00:00.000 A       BOB         8           6               BOB   
9   2020-07-01 00:00:00.000 A       JANE        9           1               BOB   
10  2020-07-15 00:00:00.000 A       JANE        10          2               BOB   
11  2020-07-21 00:00:00.000 A       JANE        11          3               BOB   
12  2020-08-01 00:00:00.000 A       JANE        12          4               JANE  
13  2020-08-15 00:00:00.000 A       JANE        13          5               JANE  
14  2020-09-20 00:00:00.000 A       JANE        14          6               JANE  
15  2021-01-15 00:00:00.000 B       BOB         1           1               BOB   
16  2021-03-01 00:00:00.000 B       BOB         2           2               BOB   
17  2021-03-08 00:00:00.000 B       BOB         3           3               BOB   
18  2021-03-20 00:00:00.000 B       BOB         4           4               BOB   
19  2021-04-01 00:00:00.000 B       BOB         5           5               BOB   
20  2021-04-15 00:00:00.000 B       SMITH       6           1               BOB   
21  2021-06-07 00:00:00.000 B       SMITH       7           2               BOB   
22  2021-06-21 00:00:00.000 B       JANE        8           6               BOB   
23  2021-07-01 00:00:00.000 B       JANE        9           1               BOB   
24  2021-07-15 00:00:00.000 B       JANE        10          2               BOB   
25  2021-07-21 00:00:00.000 B       JANE        11          3               JANE  
26  2021-08-01 00:00:00.000 B       JANE        12          4               JANE  
27  2021-08-15 00:00:00.000 B       JANE        13          5               JANE  
28  2021-09-20 00:00:00.000 B       JANE        14          6               JANE   

CodePudding user response:

Maybe this script can help as a first step for all those colleagues that are interest to analyze this case:

CREATE table bu_datalake.PatientProvider
(
    ApptDate Date,
    Patient  String,
    Provider String
);

INSERT INTO bu_datalake.PatientProvider
VALUES
('2022-01-15','A','BOB'),
('2022-03-01','A','BOB'),
('2022-03-08','A','BOB'),
('2022-03-20','A','BOB'),
('2022-04-01','A','BOB'),
('2022-04-15','A','SMITH'),
('2022-06-07','A','SMITH'),
('2022-06-21','A','BOB'),
('2022-07-01','A','JANE'),
('2022-07-15','A','JANE'),
('2022-07-21','A','JANE'),
('2022-08-01','A','JANE'),
('2022-08-15','A','JANE'),
('2022-09-20','A','JANE')
  • Related