Home > Net >  write a query to calculate cumulative performance based on daily percent change in postgresql?
write a query to calculate cumulative performance based on daily percent change in postgresql?

Time:01-14

I have daily change in a table like below.

Table: performance

date percent_change
2022/12/01 2
2022/12/02 -1
2022/12/02 3

I want to assume initial value as 100. and shows cumulative value till date, like below.

expected output:

date percent_change cumulative value
2022/12/01 2 102
2022/12/02 -1 100.98
2022/12/02 3 104.0094

CodePudding user response:

I assume that date in 3rd row is 2022/12/03. Otherwise you need to add an id or some other column to have order on percent changes that occurred in the same day.

Solution

To calculate value after percent_change, you need to multiply your current value by (100 percent_change) / 100

For day n cumulative value is 100 multiplied by product of coefficients (100 percent_change) / 100 up to day n.

In PostgreSQL "up to day n" can be implemented with window functions.

Since there is no aggregate function for multiplication, lets create it.

CREATE AGGREGATE PRODUCT(DOUBLE PRECISION) (
    SFUNC = float8mul,
    STYPE = FLOAT8
);

Final query will look like this:

SELECT
    date,
    percent_change,
    100 * product((100   percent_change)::float / 100) OVER (ORDER BY date) cumulative_value
FROM performance;

CodePudding user response:

A product of values, like the one you want to make, is nothing more than EXP(SUM(LN(...))). It results in a slightly verbose query but does not require new functions to be coded and can be ported as is to other DBMS.

In your case, as long as none of your percentages is below -100%:

SELECT date,
       percent_change,
       100 * EXP(SUM(LN(1 percent_change/100)) OVER (ORDER BY Date)) AS cumulative_value
FROM T

The SUM(...) OVER (ORDER BY ...) is what makes it a cumulative sum.

If you need to account for percentages lower than -100%, you need a bit more complexity.

SELECT date,
       percent_change,
       100 * -1 ^ SUM(CASE WHEN percent_change < -100 THEN 1 ELSE 0 END) OVER (ORDER BY Date) 
           * EXP(SUM(LN(ABS(1 percent_change/100))) OVER (ORDER BY Date))
       AS cumulative_value
FROM T
WHERE NOT EXISTS (SELECT FROM T T2 WHERE T2.percent_change = -100 AND T2.date <= T.date)
UNION ALL
SELECT Date, percent_change, 0
FROM T
WHERE EXISTS (SELECT FROM T T2 WHERE T2.percent_change = -100 AND T2.date <= T.date)

Explanation:

  1. An ABS(...) has been added to account for the values not supported in the previous query. It effectively strips the sign of 1 percentage_value / 100
  2. Before the EXP(SUM(LN(ABS(...)))), the -1 ^ SUM(...) is where the sign is put back to the calculation. Read it as: -1 to the power of how many times we encountered a negative value.
  3. The part WHERE EXISTS(...) / WHERE NOT EXISTS(...) handles the special case of percentage_value = -100%. When we encounter -100, we cannot calculate the logarithm even with a call to ABS(...).
    However, this does not matter much as the products you want to calculate are going to be 0 from this point onward.

Side note:
You can save yourself some of the complexity of the above queries by changing how you store the changes.

  • Storing 0.02 to represent 2% removes the multiplications/divisions by 100.
  • Storing 0.0198026272961797 (LN(1 0.02)) removes the need to call for a logarithm in your query.
  • Related