Home > OS >  Rolling sum where category values are replaced/updated
Rolling sum where category values are replaced/updated

Time:11-18

I'm trying to get a rolling sum working where categories/groups Amount change at any given date - when the change happens that category's new value becomes apart of the rolling sum, but the previous value of that category is then ignored; so it's a rolling sum, but only based on the latest of each category (at that point in time).

Example data (SumAmount being the problem trying to solve)

txn_id | cust_id | trans_date |  Category | amount | SumAmount
-----------------------------------------------------------------
     1 |       1 |  2020-01-01|  Ball     |      5 |   5     --first tran so sum is 5
     2 |       1 |  2020-01-02|  Cup      |      5 |   10    --sum is 10 (ball=5,Cup=5)
     3 |       1 |  2020-01-03|  Ball     |      2 |   7     --sum is 7 (ball=2,Cup=5) 
     4 |       1 |  2020-02-04|  Ball     |      4 |   9     --sum is 9 (ball=4,Cup=5)
     5 |       1 |  2020-02-05|  Ball     |      1 |   6     --sum is 6 (ball=1,Cup=5)
     6 |       1 |  2020-02-06|  Cup      |      10|   11    --sum is 11(ball=1,Cup=10)
     7 |       1 |  2020-02-07|  Phone    |      5 |   16    --sum is 16(ball=1,Cup=10,Phone=5)
     8 |       1 |  2020-02-08|  Cup      |      5 |   11    --sum is 11(ball=1,Cup=5,Phone=5)
     9 |       1 |  2020-02-09|  Ball     |      5 |   15    --sum is 15(ball=5,Cup=5,Phone=5)

I've got this working within a cursor but wondering if SET based is possible

The cursor goes like:

CREATE PROCEDURE [dbo].[PriceHistory](@CustId int, @MaxPriceHistory decimal(16,2) Output)
create table #PriceHistory ( CategoryID uniqueidentifier, Amount decimal(16,2))

    declare pricehistory_cursor CURSOR FOR
    select CategoryID, Amount
    from mytable
    where CustId =@CustId
    order by trans_date;

    declare @CategoryID uniqueidentifier
    declare @Amount decimal(16,2)
    declare @CurrentTotal decimal(16,2)

    set @MaxPriceHistory = 0

    open pricehistory_cursor
    fetch next from pricehistory_cursor into @CategoryID, @Amount

    WHILE @@FETCH_STATUS = 0  
    BEGIN 
           if (exists(select * from #PriceHistory where CategoryID = @CategoryID))
                  update #PriceHistory set Amount = @actualAmount where CategoryID = @CategoryID 
           else
                  insert into #PriceHistory(CategoryID,Amount) values (@CategoryID, @Amount)

           select @CurrentTotal = sum(Amount) from #PriceHistory

           if (@CurrentTotal > @MaxPriceHistory)
                  set @MaxPriceHistory = @CurrentTotal

           fetch next from pricehistory_cursor into @CategoryID, @Amount                 
    END

    close pricehistory_cursor
    deallocate pricehistory_cursor;

Ultimately, I'm looking for the Max SumAmount throughout the life of the transactions (column SumAmount in provided example), in which for this example it would be 16.

I get what the cursor is doing and I get why it works that way (replacing Amount for that certain Category if already exists (this is the bit I'm stumped on for SET based approach, how would I get the Cup amount of 5, when txn_id = 5 happens?), and summing it up along with all other latest category amounts made at that point in time), I just can't get my head around if it's possible to do with either some kind of recursive CTE or ROW_NUMBER.

CodePudding user response:

Since the data is in a fresh new temporary table, that also means that the primary key doesn't have gaps.

That's a nice situation for a recursive CTE.

The query below keeps the latest amount for the balls, cups and phones.
Then the calculation of the sum simply depends on the category.

WITH RCTE_BALL_CUP_PHONE AS
(
   SELECT txn_id, cust_id, trans_date, category, amount
   , CASE WHEN category = 'Ball'  THEN amount ELSE 0 END AS NearestBallAmount
   , CASE WHEN category = 'Cup'   THEN amount ELSE 0 END AS NearestCupAmount
   , CASE WHEN category = 'Phone' THEN amount ELSE 0 END AS NearestPhoneAmount
   , amount AS SumAmount
   FROM  #PriceHistory AS tmp
   WHERE txn_id = 1

   UNION ALL

   SELECT tmp.txn_id, tmp.cust_id, tmp.trans_date, tmp.category, tmp.amount
   , CASE WHEN tmp.category = 'Ball'  THEN tmp.amount ELSE c.NearestBallAmount END
   , CASE WHEN tmp.category = 'Cup'   THEN tmp.amount ELSE c.NearestCupAmount END
   , CASE WHEN tmp.category = 'Phone' THEN tmp.amount ELSE c.NearestPhoneAmount END
   , CASE 
     WHEN tmp.category = 'Ball'  THEN (tmp.amount   c.NearestCupAmount    c.NearestPhoneAmount)
     WHEN tmp.category = 'Cup'   THEN (tmp.amount   c.NearestBallAmount   c.NearestPhoneAmount)
     WHEN tmp.category = 'Phone' THEN (tmp.amount   c.NearestCupAmount    c.NearestBallAmount)
     ELSE tmp.Amount
     END
   FROM RCTE_BALL_CUP_PHONE c
   JOIN #PriceHistory AS tmp 
     ON tmp.txn_id = c.txn_id   1
)
SELECT txn_id, cust_id, trans_date, category, amount
, SumAmount
FROM RCTE_BALL_CUP_PHONE
ORDER BY txn_id;
txn_id | cust_id | trans_date | category | amount | SumAmount
-----: | ------: | :--------- | :------- | -----: | --------:
     1 |       1 | 2020-01-01 | Ball     |      5 |         5
     2 |       1 | 2020-01-02 | Cup      |      5 |        10
     3 |       1 | 2020-01-03 | Ball     |      2 |         7
     4 |       1 | 2020-02-04 | Ball     |      4 |         9
     5 |       1 | 2020-02-05 | Ball     |      1 |         6
     6 |       1 | 2020-02-06 | Cup      |     10 |        11
     7 |       1 | 2020-02-07 | Phone    |      5 |        16
     8 |       1 | 2020-02-08 | Cup      |      5 |        11
     9 |       1 | 2020-02-09 | Ball     |      5 |        15

db<>fiddle here

  • Related