Home > Software engineering >  Latest value of compared date range? (SQL/Snowflake)
Latest value of compared date range? (SQL/Snowflake)

Time:03-02

I have values in Table-A like:

Patient|Invoice|Date
A,111,2021-02-01
A,222,2021-01-01
B,333,2021-03-01
B,444,2021-02-01
C,555,2021-04-01
C,666,2021-03-01

And values in Table-B like:

Patient|Value|Date
A,2,2021-01-05
A,3,2021-01-05
A,3,2021-02-05
B,1,2021-02-05
B,1,2021-03-05
C,6,2021-01-01

And I want to join the two tables such that I see the most recent cumulative sum of values in Table-B as-of the Date in Table-A for a given Patient.

Patient|Invoice|Latest Value|Date
A,111,5,2021-02-01
A,222,0,2021-01-01
B,333,1,2021-03-01
B,444,0,2021-02-01
C,555,6,2021-04-01
C,666,6,2021-03-01

How would I join these two tables by date to accomplish this?

CodePudding user response:

First step seems like a basic SQL join:

select patient, invoice, sum(value), date
from table1 a
join table2 b
on a.patient=b.patient
and a.date=b.date
group by patient, invoice, date

But instead of a plain sum() you can apply a sum() over():

select patient, invoice
  , sum(value) over(partition by patient order by date)
  , date
from table1 a
join table2 b
on a.patient=b.patient
and a.date=b.date
group by patient, invoice, date

CodePudding user response:

I think that first we need to calculate the time intervals when the invoice is valid (using LAG function), then calculate the cumulative SUM.

WITH A AS (
  SELECT Patient, Invoice, Date, IFNULL(LAG(Date) OVER(PARTITION BY Patient ORDER BY Date), '1900-01-01') AS LG
    FROM Table_A
)
SELECT DISTINCT A.Patient, A.Invoice, IFNULL(SUM(B.Value) OVER(PARTITION BY A.Patient ORDER BY A.Date), 0) AS Latest_Value, A.Date
  FROM A
  LEFT JOIN Table_B AS B 
    ON A.Patient = B.Patient
   AND B.Date >= A.LG AND B.Date < A.Date
GROUP BY A.Patient, A.Invoice, A.Date, B.Value
ORDER BY A.Patient, A.Invoice, A.Date; 
  • Related