I have a following timeseries data in bigquery:
Timestamp EventID Value
2022-03-05 06:00:10 1 NULL
2022-03-05 06:00:11 3 5
2022-03-05 06:00:12 2 3
2022-03-05 06:00:13 3 2
2022-03-05 06:00:14 4 1
2022-03-05 06:00:17 2 6
2022-03-05 06:00:17 5 NULL
2022-03-05 06:30:15 1 NULL
2022-03-05 06:30:18 3 2
2022-03-05 06:31:22 2 4
2022-03-05 06:35:56 3 3
2022-03-05 06:36:18 4 1
2022-03-05 06:39:17 2 2
2022-03-05 06:50:24 5 NULL
I want to write a SQL query to sum values having the same eventID between eventId 1 and eventId 5. The output will look like that:
StartEventId EndEventId StartTime EndTime total_2 total_3 total_4
1 5 2022-03-05 06:00:10 2022-03-05 06:00:17 9 7 1
1 5 2022-03-05 06:30:15 2022-03-05 06:50:24 6 5 1
total_2 is the sum of values having eventId equal 2 between eventID 1 and 5. The same for total_3 and total_4.
Thanks for your help!
CodePudding user response:
Consider below approach
select min(EventID) StartEventId, max(EventID) EndEventId,
min(Timestamp) StartTime, max(Timestamp) EndTime,
sum(if(EventID = 2, Value, null)) total_2,
sum(if(EventID = 3, Value, null)) total_3,
sum(if(EventID = 4, Value, null)) total_4
from (
select *, div(countif(EventID in (1,5)) over win - 1, 2) grp
from your_table
window win as (order by Timestamp, EventID)
)
group by grp
if applied to sample data in your question - output is