SQL Window Function over sliding time window

Time:08-25

I have the following data:

            country  objectid  objectuse
record_date
2022-07-20    chile         0          4
2022-07-01    chile         1          4
2022-07-02    chile         1          4
2022-07-03    chile         1          4
2022-07-04    chile         1          4
...             ...       ...        ...
2022-07-26     peru      3088          4
2022-07-27     peru      3088          4
2022-07-28     peru      3088          4
2022-07-30     peru      3088          4
2022-07-31     peru      3088          4

The data describes the daily usage of each object within a country for a single month (July 2022); not all objects are used every day. One of the things I am interested in is the sum, per country, of each object's maximum usage over the month:

WITH month_max AS (
    SELECT
        country,
        objectid,
        MAX(objectuse) AS maxuse
    FROM mytable
    GROUP BY
        country,
        objectid
)
SELECT
    country,
    SUM(maxuse)
FROM month_max
GROUP BY country;

Which results in this:

country   sum
-------------
chile    1224
peru    17008   

But what I actually want is a running version of this: for each date, the sum of each object's maximum from the beginning of the month up to that date. The result should look something like:

            country       sum  
record_date
2022-07-01    chile         1
2022-07-01     peru         1
2022-07-02    chile         2
2022-07-02     peru         3
...             ...       ...
2022-07-31    chile       1224
2022-07-31     peru      17008

I tried using a window function like this to no avail:

SELECT
    *,
    SUM(objectuse) OVER (
        PARTITION BY country
        ORDER BY record_date ROWS 30 PRECEDING
    ) AS cumesum
FROM mytable
ORDER BY cumesum DESC;

Is there a way I can achieve the desired result in SQL?

Thanks in advance.

EDIT: For what it's worth, I asked the same question for Pandas and received an answer; perhaps it helps in figuring out how to do it in SQL.

CodePudding user response:

We can use SUM() as a window function, partitioned by year-month and country and ordered by date.

SELECT record_date, country, objectid,
       SUM(objectuse) OVER (PARTITION BY TO_CHAR(record_date, 'YYYY-MM'), country
                            ORDER BY record_date
                            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sum
FROM mytable
ORDER BY record_date;
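
To see what a cumulative SUM(objectuse) like the one above produces, here is a minimal sketch run through Python's sqlite3 module (assuming SQLite 3.25 or newer for window functions). The sample rows are made up for illustration, and strftime('%Y-%m', ...) is swapped in for TO_CHAR, which SQLite does not have. It compares the last running total per country with the sum of per-object monthly maxima from the question.

```python
import sqlite3

# Hypothetical sample data: (record_date, country, objectid, objectuse)
SAMPLE_ROWS = [
    ("2022-07-01", "chile", 1, 1),
    ("2022-07-02", "chile", 1, 1),
    ("2022-07-03", "chile", 1, 3),
    ("2022-07-02", "chile", 2, 2),  # object 2 has no row on 07-03
    ("2022-07-01", "peru", 3, 1),
    ("2022-07-02", "peru", 3, 2),
    ("2022-07-02", "peru", 4, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (record_date TEXT, country TEXT, "
    "objectid INTEGER, objectuse INTEGER)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", SAMPLE_ROWS)

# Last value of the running total of raw usage, per country.
# Usage is non-negative here, so the largest running total is the last one.
RAW_TOTALS_SQL = """
SELECT country, MAX(running_total) AS last_running_total
FROM (
    SELECT country,
           SUM(objectuse) OVER (
               PARTITION BY strftime('%Y-%m', record_date), country
               ORDER BY record_date
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total
    FROM mytable
) AS t
GROUP BY country
ORDER BY country
"""

# Sum of per-object monthly maxima, as in the question's first query.
MAX_TOTALS_SQL = """
WITH month_max AS (
    SELECT country, objectid, MAX(objectuse) AS maxuse
    FROM mytable
    GROUP BY country, objectid
)
SELECT country, SUM(maxuse)
FROM month_max
GROUP BY country
ORDER BY country
"""

raw_totals = conn.execute(RAW_TOTALS_SQL).fetchall()
max_totals = conn.execute(MAX_TOTALS_SQL).fetchall()
print("running total of raw usage:", raw_totals)
print("sum of monthly maxima:     ", max_totals)
```

On this sample the two results differ (7 vs. 5 for chile, 4 vs. 3 for peru): the window sum accumulates every day's usage, while the question asks for each object to count only once, at its maximum so far.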

CodePudding user response:

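WITH month_max AS (
    SELECT country, objectid, record_date,
        -- running maximum of each object up to and including record_date
        MAX(objectuse) OVER (PARTITION BY country, objectid ORDER BY record_date) AS maxuse
    FROM mytable
)
SELECT
    record_date,
    country,
    SUM(maxuse)
FROM month_max
GROUP BY record_date, country
ORDER BY record_date, country;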

This does assume one row per object per date.
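
Since the question says objects can skip days, one way around the one-row-per-object-per-date assumption is to track how much each object's running maximum increases on each of its rows, then accumulate those increases per country. The following is only a sketch under stated assumptions: it runs the SQL through Python's sqlite3 module (SQLite 3.25 or newer), the sample rows are made up, and the CTE names (running, deltas, daily) and the running_sum column are mine, not from the question.

```python
import sqlite3

# Hypothetical sample data: (record_date, country, objectid, objectuse)
SAMPLE_ROWS = [
    ("2022-07-01", "chile", 1, 1),
    ("2022-07-02", "chile", 1, 1),
    ("2022-07-03", "chile", 1, 3),
    ("2022-07-02", "chile", 2, 2),  # object 2 has no row on 07-03
    ("2022-07-01", "peru", 3, 1),
    ("2022-07-02", "peru", 3, 2),
    ("2022-07-02", "peru", 4, 1),
]

QUERY = """
WITH running AS (
    -- running maximum of each object up to each of its rows
    SELECT country, objectid, record_date,
           MAX(objectuse) OVER (
               PARTITION BY country, objectid ORDER BY record_date
           ) AS run_max
    FROM mytable
),
deltas AS (
    -- how much the running maximum grew compared with the object's previous row
    SELECT country, record_date,
           run_max - LAG(run_max, 1, 0) OVER (
               PARTITION BY country, objectid ORDER BY record_date
           ) AS delta
    FROM running
),
daily AS (
    -- total growth per country and date
    SELECT country, record_date, SUM(delta) AS day_delta
    FROM deltas
    GROUP BY country, record_date
)
SELECT record_date, country,
       SUM(day_delta) OVER (
           PARTITION BY country ORDER BY record_date
       ) AS running_sum
FROM daily
ORDER BY record_date, country
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (record_date TEXT, country TEXT, "
    "objectid INTEGER, objectuse INTEGER)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", SAMPLE_ROWS)

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

On the sample data, chile reaches 5 on 2022-07-03 even though object 2 has no row that day, because its earlier maximum of 2 is already included in the accumulated total. The last value per country matches the sum of monthly maxima. A date on which a country has no rows at all still produces no output row; filling those would need a calendar table or a generated date series.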
