Home > Software design >  How to get cumulative values and reset it under a condition, and then aggregate it again on Postgres
How to get cumulative values and reset it under a condition, and then aggregate it again on Postgres

Time:12-20

I am trying to get the column cumulative_point like below.

date point reset_date cumulative_point
2019-01-01 00:00:00 1 true 1
2019-01-02 00:00:00 3 false 4
2019-01-03 00:00:00 1 false 5
2019-01-04 00:00:00 2 false 7
2019-01-05 00:00:00 1 true 1
2019-01-06 00:00:00 4 false 5
2019-01-07 00:00:00 2 false 7

cumulative_point is a cumulative value of point column, then reset it when reset_date column is true and doesn't take over the previous cumulative value.

I have been trying to do this for a while but I couldn't figure out how to reset cumulative_point when the reset_date column is true.

I did like

SUM(CASE WHEN reset_date then 0 ELSE point END) as cumulative_point

but then the SUM function added values to the previous value like below.

date point reset_date cumulative_point
2019-01-01 00:00:00 1 true 1
2019-01-02 00:00:00 3 false 4
2019-01-03 00:00:00 1 false 5
2019-01-04 00:00:00 2 false 7
2019-01-05 00:00:00 1 true 1
2019-01-06 00:00:00 4 false 12
2019-01-07 00:00:00 2 false 14

How can I reset the cumulative values by sum() when reset_date is true and ignore the previous cumulative values?

Thank you!

CodePudding user response:

dbfiddle

create a view:

CREATE VIEW test1 AS
SELECT
    *
    , CASE WHEN reset_date THEN
        0
    ELSE
        point
    END AS reset_point
    , lead((
        CASE WHEN reset_date THEN
            0
        ELSE
            point
        END) , 1 , 0) OVER (ORDER BY date)
    , lag((
        CASE WHEN reset_date THEN
            0
        ELSE
            point
        END) , 1 , 0) OVER (ORDER BY date)
FROM
    cumulative_point
ORDER BY
    date;

query:

WITH cte AS (
    SELECT
        *,
        CASE WHEN lead > 0
            AND lag > 0
            AND reset_point = 0 THEN
            1 --sequence false true false.
        WHEN lead = 0
            AND lag > 0
            AND reset_point = 0 THEN
            1 --- sequence false true true.
        WHEN lead > 0
            AND lag = 0
            AND reset_point = 0 THEN
            1 --- sequence true true false
        WHEN lead = 0
            AND lag = 0
            AND reset_point = 0 THEN
            1 -- true true true
        END AS grp
    FROM
        pg_temp.test1
),
cte1 AS (
    SELECT
        *,
        count(grp) OVER (ORDER BY date) AS real_grp
FROM
    cte
)
SELECT
    date,
    point,
    reset_point,
    reset_date,
    reset_point,
    sum(reset_point) OVER (PARTITION BY cte1.real_grp ORDER BY date)
FROM
    cte1;

basic idea is reset_date is boolean. I added more test case. Hope covered all the corner case. using window function can solve the problem.

CodePudding user response:

You may use the running sum of 'reset_date' (1 where true, 0 where false) to define a new group whenever a true value is shown, then use the cumulative sum of 'point' partitioned by that group.

Select date, point, reset_date,
  SUM(point) Over (Partition By grp Order By date) As cumulative_point
From
(
  Select *, 
   SUM(Case When reset_date Then 1 Else 0 End) Over (Order By date) As grp
  From table_name
) T
Order By date

See demo.

  • Related