I want to do a window function (like the SUM() OVER() function), but there are two catches:
- I want to consider the last 3 months on my moving sum, but the number of rows are not consistent. Some months have 3 entries, others may have 2, 4, 5, etc;
- There is also a "group" column, and the moving sum should sum only the amounts of the same group.
In summary, a have a table that has the following structure:
id | date | group | amount |
---|---|---|---|
1 | 2022-01 | group A | 1100 |
2 | 2022-01 | group D | 2500 |
3 | 2022-02 | group A | 3000 |
4 | 2022-02 | group B | 1000 |
5 | 2022-02 | group C | 2500 |
6 | 2022-03 | group A | 2000 |
7 | 2022-04 | group C | 1000 |
8 | 2022-05 | group A | 1500 |
9 | 2022-05 | group D | 2000 |
10 | 2022-06 | group B | 1000 |
So, I want to add a moving sum column, containing the sum the ammount for each group for the last 3 months. The sum should not reset every 3 months, but should consider only the previous values from the 3 months prior, and of the same group.
The end result should look like:
id | date | group | amount | moving_sum_three_months |
---|---|---|---|---|
1 | 2022-01 | group A | 1100 | 1100 |
2 | 2022-01 | group D | 2500 | 2500 |
3 | 2022-02 | group A | 3000 | 4100 |
4 | 2022-02 | group B | 1000 | 1000 |
5 | 2022-02 | group C | 2500 | 2500 |
6 | 2022-03 | group A | 2000 | 6100 |
7 | 2022-04 | group C | 1000 | 3500 |
8 | 2022-05 | group A | 1500 | 3500 |
9 | 2022-05 | group D | 2000 | 2000 |
10 | 2022-06 | group B | 1200 | 1200 |
The best example to see how the sum work in this example is line 8.
- It considers only lines 8 and 6 for the sum, because they are the only one that meet the criteria;
- Line 1 and 3 do not meet the criteria, because they are more than 3 months old from line 8 date;
- All the other lines are not from group A, so they are also excluded from the sum.
Any ideias? Thanks in advance for the help!
CodePudding user response:
Use SUM()
as a window function partitioning the window by group
in RANGE
mode. Set the frame to go back 3 months prior the current record using INTERVAL '3 months'
, e.g.
SELECT *, SUM(amount) OVER w AS moving_sum_three_months
FROM t
WINDOW w AS (PARTITION BY "group" ORDER BY "date"
RANGE BETWEEN INTERVAL '3 months' PRECEDING AND CURRENT ROW)
ORDER BY id
Demo: db<>fiddle