Home > Net >  Calculating moving sum (or SUM OVER) for the last X months, but with irregular number of rows
Calculating moving sum (or SUM OVER) for the last X months, but with irregular number of rows

Time:07-28

I want to do a window function (like the SUM() OVER() function), but there are two catches:

  1. I want to consider the last 3 months on my moving sum, but the number of rows are not consistent. Some months have 3 entries, others may have 2, 4, 5, etc;
  2. There is also a "group" column, and the moving sum should sum only the amounts of the same group.

In summary, a have a table that has the following structure:

id date group amount
1 2022-01 group A 1100
2 2022-01 group D 2500
3 2022-02 group A 3000
4 2022-02 group B 1000
5 2022-02 group C 2500
6 2022-03 group A 2000
7 2022-04 group C 1000
8 2022-05 group A 1500
9 2022-05 group D 2000
10 2022-06 group B 1000

So, I want to add a moving sum column, containing the sum the ammount for each group for the last 3 months. The sum should not reset every 3 months, but should consider only the previous values from the 3 months prior, and of the same group.

The end result should look like:

id date group amount moving_sum_three_months
1 2022-01 group A 1100 1100
2 2022-01 group D 2500 2500
3 2022-02 group A 3000 4100
4 2022-02 group B 1000 1000
5 2022-02 group C 2500 2500
6 2022-03 group A 2000 6100
7 2022-04 group C 1000 3500
8 2022-05 group A 1500 3500
9 2022-05 group D 2000 2000
10 2022-06 group B 1200 1200

The best example to see how the sum work in this example is line 8.

  • It considers only lines 8 and 6 for the sum, because they are the only one that meet the criteria;
  • Line 1 and 3 do not meet the criteria, because they are more than 3 months old from line 8 date;
  • All the other lines are not from group A, so they are also excluded from the sum.

Any ideias? Thanks in advance for the help!

CodePudding user response:

Use SUM() as a window function partitioning the window by group in RANGE mode. Set the frame to go back 3 months prior the current record using INTERVAL '3 months', e.g.

SELECT *, SUM(amount) OVER w AS moving_sum_three_months
FROM t
WINDOW w AS (PARTITION BY "group" ORDER BY "date"
             RANGE BETWEEN INTERVAL '3 months' PRECEDING AND CURRENT ROW)
ORDER BY id

Demo: db<>fiddle

  • Related