Running total by date/ID based on latest change to value SQL-CodePudding

I have a unique case where I want to calculate the running total of quantities day over day. I have been searching a lot but couldn't find the right answer. Code-wise, there is nothing much I can share as it refers to a lot of sensitive data

Below is the table of dummy data:

As you can see, there are multiple duplicate IDs by date. I want to be able to calculate the running total of a date as follows:

For 2022/03/24, the running total would be 9 33 = 42, on 2022/03/26 the running total should be 9 31 = 40. Essentially, the running total for any given day should pick the last value by ID if it changed or the value that exists. In this case on 2022/03/26 for that date, for ID 2072, we pick 31 and not 33 because that's the latest value available.

Expected Output:

There maybe be many days spanning across and the running total needs to be day over day.

Possible related question: SQL Server running total based on change of state of a column

PS: For context, ID is just a unique identifier for an inventory of items. Each item's quantity changes day by day. In this example, ID 1's inventoyr last changed on 2022/03/24 where as ID 2072's changed multiple times. Running total for 2022/03/24 would be quantities of inventory items on that day. On 26th there are no changes for ID 1 but ID 2072 changed, the inventory pool should reflect the total as current inventory size of ID 2072 current size of ID 1. On 26th, again ID 1 did not have any change, but ID 2072 changed. Therefore inventory size = current size of ID 2072 current size of ID 1, in this case, 40. Essentially, it is just a current size of inventory with day over day change.

Any help would be really appreciated! Thanks.

CodePudding user response：

If I understand correctly you can try to use COUNT window function get duplicate ID then do self-join

SELECT t1.dates,
       SUM(t1.Quantity   ISNULL(t2.Quantity,0))
FROM (
  SELECT *,
       count(*) over(partition by ID) cnt
  FROM T
) t1  
LEFT JOIN T t2
ON t1.dates >= t2.dates AND t1.id <> t2.id
WHERE t1.cnt > 1
GROUP BY t1.dates

sqlfiddle

CodePudding user response：

I added a few more rows just in case if this is what you really wanted.

I used T-SQL.

declare @orig table(
id          int,
quantity    int,
rundate     date
)


insert into @orig
values (1,9,'20220324'),(2072,33,'20220324'),(2072,31,'20220326'),(2072,31,'20220327'),
(2,10,'20220301'),(2,20,'20220325'),(2,30,'20220327')


declare @dates table (
runningdate date
)

insert into @dates
select distinct rundate from @orig
order by rundate


declare @result table (
dates               date,
running_quality     int
)


DECLARE @mydate date
DECLARE @sum int

-- CURSOR definition
DECLARE my_cursor CURSOR FOR  
SELECT * FROM @dates

OPEN my_cursor

-- Perform the first fetch
FETCH NEXT FROM my_cursor into @mydate

-- Check @@FETCH_STATUS to see if there are any more rows to fetch
WHILE @@FETCH_STATUS = 0
BEGIN

;with cte as (
select * from @orig
where rundate <= @mydate
), cte2 as (
select id, max(rundate) as maxrundate
from cte
group by id
), cte3 as (
select a.*
from cte as a join cte2 as b
on a.id = b.id and a.rundate = b.maxrundate
)
select @sum = sum(quantity)
from cte3

insert into @result
select @mydate, @sum


-- This is executed as long as the previous fetch succeeds
FETCH NEXT FROM my_cursor into @mydate

END -- cursor

CLOSE my_cursor
DEALLOCATE my_cursor

select * from @result

Result:

dates running_quality 2022-03-01 10 2022-03-24 52 2022-03-25 62 2022-03-26 60 2022-03-27 70