Home > database >  Rolling sum previous 12 months per month (SQL- Snowflake)
Rolling sum previous 12 months per month (SQL- Snowflake)

Time:07-30

I have the following table structure:

date New Customers
01.04.21 4
01.05.21 1
01.06.21 2
01.07.21 6
01.08.21 3
01.09.21 2
01.10.21 3
01.11.21 8
01.12.21 3
01.01.22 4
01.02.22 0
01.03.22 3
01.04.22 3
01.05.22 2
01.06.22 3
01.07.22 1
01.08.22 3
01.09.22 2
01.10.22 3
01.11.22 1
01.12.22 6
01.01.23 8
01.02.23 4

I am struggling in Snowflake with a query that should show me the the sum of previous 12 months for every distinct month in the table.

I am trying to create a result set that looks like:

Reporting Date Customer #
01.03.22 39
01.04.22 38
01.05.22 39
01.06.22 40
01.07.22 35
01.08.22 35
01.09.22 35
01.10.22 35
01.11.22 28
01.12.22 31
01.01.23 35
01.02.23 39

The way reporting date 01.03.2022 is calculated: it is the sum of past 12 months row of "new customers" columns = 4 1 2 6 3 2 3 8 4 0 3= 39

CodePudding user response:

Seems to work - but you need an additional nested query to build the running sum and a filter value to remove the first 12 months:

WITH
-- your input, don't use in final query ..
indata(dt,New_Customers) AS (
          SELECT DATE '2021-04-01',4 UNION ALL SELECT DATE '2021-05-01',1
UNION ALL SELECT DATE '2021-06-01',2 UNION ALL SELECT DATE '2021-07-01',6
UNION ALL SELECT DATE '2021-08-01',3 UNION ALL SELECT DATE '2021-09-01',2
UNION ALL SELECT DATE '2021-10-01',3 UNION ALL SELECT DATE '2021-11-01',8
UNION ALL SELECT DATE '2021-12-01',3 UNION ALL SELECT DATE '2022-01-01',4
UNION ALL SELECT DATE '2022-02-01',0 UNION ALL SELECT DATE '2022-03-01',3
UNION ALL SELECT DATE '2022-04-01',3 UNION ALL SELECT DATE '2022-05-01',2
UNION ALL SELECT DATE '2022-06-01',3 UNION ALL SELECT DATE '2022-07-01',1
UNION ALL SELECT DATE '2022-08-01',3 UNION ALL SELECT DATE '2022-09-01',2
UNION ALL SELECT DATE '2022-10-01',3 UNION ALL SELECT DATE '2022-11-01',1
UNION ALL SELECT DATE '2022-12-01',6 UNION ALL SELECT DATE '2023-01-01',8
UNION ALL SELECT DATE '2023-02-01',4
)
-- real query starts here, replace following comma with "WITH"
,
-- need to build the row number and the running sum first in a subselect ...
olap AS (
  SELECT
    dt
  , SUM(new_customers) OVER(ORDER BY dt ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS running_cust_count
  , ROW_NUMBER()       OVER(ORDER BY dt                                          ) AS rownum
  FROM indata
) 
SELECT
  dt AS rep_date
, running_cust_count
FROM olap
WHERE rownum >= 12
;
-- out   rep_date  | running_cust_count 
-- out ------------ --------------------
-- out  2022-03-01 |                 39
-- out  2022-04-01 |                 38
-- out  2022-05-01 |                 39
-- out  2022-06-01 |                 40
-- out  2022-07-01 |                 35
-- out  2022-08-01 |                 35
-- out  2022-09-01 |                 35
-- out  2022-10-01 |                 35
-- out  2022-11-01 |                 28
-- out  2022-12-01 |                 31
-- out  2023-01-01 |                 35
-- out  2023-02-01 |                 39

CodePudding user response:

You can try a window function to create aggregates:

select date, SUM(customers) OVER (ORDER BY date
                                  ROWS BETWEEN 11 PRECEDING AND CURRENT ROW)
from table
group by date, customers
order by 1;

This is Postgres, but if I remember well this should work in Snowflake too.

  • Related