Home > Software engineering >  Aggregation on a window function with minimum windows size
Aggregation on a window function with minimum windows size

Time:10-27

Assuming that I have a table that collects ordered records:

SELECT * FROM items_table ORDER BY day ASC
day total_items
1 5
2 6
3 3
4 1
5 2
6 2
7 4

I wish to calculate the maximum number of "total_items" for every 3 days, while the window is at least of size 3. If at a specific row the window size is not big enough the returned value should be NULL.

day min_3_days_total_items
1 NULL
2 NULL
3 6
4 6
5 3
6 2
7 4

in Pandas it would be:

df['total_items'].rolling(3, min_periods=3).max()

How can this be done in BigQuery SQL?

CodePudding user response:

case
  when count(*) over (order by day rows between 2 preceding and current row) = 3
  then max(total_items) over (order by day rows between 2 preceding and current row) end

You could similarly base your test on row_number() or lag() if the nulls are only because they appear at the beginning of the list. I guess that depends on how you want to think about the data.

CodePudding user response:

Consider below approach

select *,
  if(count(*) over recent_3_days = 3, 
    max(total_items) over recent_3_days, 
    null) as max_items_in_recent_min_3_days
from items_table  
window recent_3_days as (order by day range between 2 preceding and current row)
order by day
  • Related