Home > OS >  How to use window function to aggregate data based on date or rank column?
How to use window function to aggregate data based on date or rank column?

Time:04-07

So I have a list of shipments and I have the the order total and the total for each individual shipment, but I'm struggling to come up with the code to create an additional column for cumulative shipments, which would include the current shipment, plus all previous shipments for that order. Here's a result of what I have so far:

OrderNo |ShipDate   |OrderTotal |Shipment Total |Cumulative Shipments |Rank
22396   |2022-04-04 |639,964    |2,983          |639,966              |3
22396   |2022-03-31 |639,964    |5,626          |639,966              |2
22396   |2022-02-24 |639,964    |631,355        |639,966              |1

So these are 3 separate shipments for the same order. The 1st shipments in row 3 is correct, but I need the cumulative shipments column for row 2 to be the shipments total sum of both, so $631,555 5,626. Following that same logic, row 1 should be the sum of all 3, which at that point would be equal to the order total of $639,964. Here's what that would look like:

OrderNo |ShipDate   |OrderTotal |Shipment Total |Cumulative Shipments |Rank
22396   |2022-04-04 |639,964    |2,983          |639,964              |3
22396   |2022-03-31 |639,964    |5,626          |636,981              |2
22396   |2022-02-24 |639,964    |631,355        |631,355              |1

I'm assuming the best way to accomplish this is using over(partition by ()), but I'm struggling to come up with the code. Here's what I have so far:

SELECT
    OrderNo,
    ShipDate,
    OrderTotal,
    [Shipment Total],
    SUM([Shipment Total]) OVER(PARTITION BY OrderNo) AS [Cumulative Shipments],
    [Rank]
FROM Shipments
WHERE OrderNo = '22396'

The [Rank] column is from an earlier CTE which calculates the rank of that shipment based on shipdate:

ROW_NUMBER() OVER(PARTITION BY d.OrderNo ORDER BY d.ShipDate) AS [Rank]

I need something like SUM([Shipment Total]) where rank is equal or less than the current rank. Same thing can be accomplished with the date column I'm sure, but just not sure how to finish the query

CodePudding user response:

You seem to be half way there just missing an ordering criteria for a functioning cumulative sum, such as

select *, 
  Sum(ShipmentTotal) 
    over(partition by OrderNo 
         order by ShipDate rows between unbounded preceding and current row)
from Shipments;

CodePudding user response:

I didn't bother building the CTE, but assuming you have some PK, you can self-join:

SELECT s.ShipmentsID, SUM(cumulative.ShipmentTotal) AS Sum
FROM Shipments s
LEFT JOIN Shipments cumulative ON cumulative.ShipDate <= s.ShipDate
GROUP BY s.ShipmentsID 
ShipmentsID Sum
1 631355.52
2 636982.29
3 639966.04
  • Related