Assuming that I have a table that collects ordered records:
SELECT * FROM items_table ORDER BY day ASC
day | total_items |
---|---|
1 | 5 |
2 | 6 |
3 | 3 |
4 | 1 |
5 | 2 |
6 | 2 |
7 | 4 |
I wish to calculate the maximum number of "total_items" for every 3 days, while the window is at least of size 3. If at a specific row the window size is not big enough the returned value should be NULL.
day | min_3_days_total_items |
---|---|
1 | NULL |
2 | NULL |
3 | 6 |
4 | 6 |
5 | 3 |
6 | 2 |
7 | 4 |
in Pandas it would be:
df['total_items'].rolling(3, min_periods=3).max()
How can this be done in BigQuery SQL?
CodePudding user response:
case
when count(*) over (order by day rows between 2 preceding and current row) = 3
then max(total_items) over (order by day rows between 2 preceding and current row) end
You could similarly base your test on row_number()
or lag()
if the nulls are only because they appear at the beginning of the list. I guess that depends on how you want to think about the data.
CodePudding user response:
Consider below approach
select *,
if(count(*) over recent_3_days = 3,
max(total_items) over recent_3_days,
null) as max_items_in_recent_min_3_days
from items_table
window recent_3_days as (order by day range between 2 preceding and current row)
order by day