I have been trying to solve a problem for a few days now, but I just can't get it solved. Hence my question today. I would like to calculate the running sum in the following table. My result so far looks like this:
PersonID | Visit_date | Medication_intake | Previous_date | Date_diff | Running_sum |
---|---|---|---|---|---|
1 | 2012-04-26 | 1 | |||
1 | 2012-11-16 | 1 | 2012-04-26 | 204 | 204 |
1 | 2013-04-11 | 0 | |||
1 | 2013-07-19 | 1 | |||
1 | 2013-12-05 | 1 | 2013-07-19 | 139 | 343 |
1 | 2014-03-18 | 1 | 2013-12-05 | 103 | 585 |
1 | 2014-06-24 | 0 | |||
2 | 2014-12-01 | 1 | |||
2 | 2015-03-09 | 1 | 2014-12-01 | 98 | 98 |
2 | 2015-09-28 | 0 |
This is my desired result. So only the running sum over contiguous blocks (Medication_intake=1) should be calculated.
PersonID | Visit_date | Medication_intake | Previous_date | Date_diff | Running_sum |
---|---|---|---|---|---|
1 | 2012-04-26 | 1 | |||
1 | 2012-11-16 | 1 | 2012-04-26 | 204 | 204 |
1 | 2013-04-11 | 0 | |||
1 | 2013-07-19 | 1 | |||
1 | 2013-12-05 | 1 | 2013-07-19 | 139 | 139 |
1 | 2014-03-18 | 1 | 2013-12-05 | 103 | 242 |
1 | 2014-06-24 | 0 | |||
2 | 2014-12-01 | 1 | |||
2 | 2015-03-09 | 1 | 2014-12-01 | 98 | 98 |
2 | 2015-09-28 | 0 |
I work with Microsoft SQL Server 2019 Express.
Thank you very much for your tips!
CodePudding user response:
This is a gaps and islands problem, and one approach uses the difference in row numbers method:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY PersonID
ORDER BY Visit_date) rn1,
ROW_NUMBER() OVER (PARTITION BY PersonId, Medication_intake
ORDER BY Visit_date) rn2
FROM yourTable
)
SELECT PersonID, Visit_date, Medication_intake, Previous_date, Date_diff,
CASE WHEN Date_diff IS NOT NULL AND Medication_intake = 1
THEN SUM(Date_diff) OVER (PARTITION BY PersonID, rn1 - rn2
ORDER BY Visit_date) END AS Running_sum
FROM cte
ORDER BY PersonID, Visit_date;
Demo
The CASE
expression in the outer query computes the rolling sum for date diff along islands of records having a medication intake value of 1. For other records, or for records where date diff be null, the value generated is simply null.