i want to calculate rolling average over a rolling period of 252 days but only if 252 days data available in table otherwise null value for rows.
currently i am using this query:
SELECT datestamp, symbol, avg(close) OVER (PARTITION BY symbol ORDER BY datestamp ROWS BETWEEN 251 PRECEDING AND CURRENT ROW) FROM daily_prices
.
it is giving avg also if 252 days data not available. i want acheive result as we get with pandas rolling function by defining min_period value.
CodePudding user response:
"i want to calculate rolling average over a rolling period of 252 days"
The clause ROWS BETWEEN 251 PRECEDING AND CURRENT ROW
doesn't refer to a period of time but to the number of rows in the window and which preceed the current row according to the ORDER BY datestamp
clause.
I would suggest you a slightly different solution for the window function in order to implement the period of time :
SELECT datestamp, symbol, avg(close) OVER (PARTITION BY symbol ORDER BY datestamp RANGE BETWEEN '251 days' PRECEDING AND CURRENT ROW) FROM daily_prices
Then I don't understand in which case you want a null value. In the window of a current row, you will have at least the current row, so that the avg() can't be null.
CodePudding user response:
Just do a count over the same window and use it to modify your results. I used a named window to avoid specifying the same window repeatedly.
with daily_prices as (select 1 as symbol, 5 as close, t as datestamp from generate_series(now()-interval '1 year',now(),interval '1 day') f(t))
SELECT
datestamp,
symbol,
case when count(close) OVER w = 252 then
avg(close) OVER w
end
FROM daily_prices
window w as (PARTITION BY symbol ORDER BY datestamp ROWS BETWEEN 251 PRECEDING AND CURRENT ROW);