Home > database >  sql loop with id, date, and 1 additional field (in redshift)
sql loop with id, date, and 1 additional field (in redshift)

Time:01-19

The scenario is: A patient is taking multiple medications (denoted by ndc) Each medication has a fill date with a corresponding fill_nbr. eg: each ndc has fill_nbr = 1 as the initial fill. NDCs may or not have refills: fill_nbr = 2 is the 2nd fill (or 1st refill) , fill_nbr = 3 is the 3rd fill (or 2nd refill), etc.

I need the datediff between each fill_nbr and the next fill_nbr for each ndc - by patient. If there is no refill, the days_between_refills should be 0.

Using dbeaver/redshift.

The end result needs to look like this:

| patient_id |  rx_date  | fill_nbr |    ndc     | days_between_refills |
|   654654   | 01/13/2021|    1     | 1565475214 |          0           |      
|   654654   | 01/18/2021|    2     | 1565475214 |          5           |
|   654654   | 01/21/2021|    3     | 1565475214 |          3           |
|   654654   | 02/02/2022|    1     | 2449499446 |          0           |
|   654654   | 02/09/2022|    2     | 2449499446 |          7           |
|   654654   | 02/19/2022|    3     | 2449499446 |         10           |
|   654654   | 02/29/2022|    4     | 2449499446 |          9           |
|   832156   | 03/01/2020|    1     | 3489446548 |          0           |
|   832156   | 03/12/2020|    2     | 3489446548 |         12           |
|   832156   | 03/20/2020|    3     | 3489446548 |          8           |
|   832156   | 03/26/2020|    4     | 3489446548 |          6           |

The 2 issues I would like to solve are:

1. loop by patient_id/ndc

2. all datediff numbers in 1 column: "days_between_fill"

this script works:

t1.ndc,DATEDIFF(day,t1.rx_date,t2.rx_date) as days_between_fill
FROM sandbox.table1 t1
INNER JOIN
sandbox.table2 t2
on
t1.ndc = t2.ndc
WHERE 
t1.patient_id = t2.patient_id 
and
t1.fill_nbr  = 1 and t2.fill_nbr  = 2 

but it needs a loop by patientid/ndc as there are many patients with >200 refills of the same ndc

and I do **NOT** want to write this 200  times...

t1.fill_nbr  = 1 and t2.fill_nbr  = 2
t1.fill_nbr  = 2 and t2.fill_nbr  = 3
t1.fill_nbr  = 3 and t2.fill_nbr  = 4

etc.......

CodePudding user response:

Your need to solve this by LAG function.

Please check enter image description here

CodePudding user response:

select *,   lag(rx_date, 1) over (partition by patient_id, ndc order by fill_nbr) as last_refill_date , coalesce(datediff(day, lag(rx_date, 1) over (partition by patient_id, ndc order by fill_nbr), rx_date), 0) as days_between_refills from table order by 1, 2, 3 ) ; 
  • Related