Snowflake SQL: trying to calculate time difference between subsets of subsequent rows

Time:07-12

I have some data like the following in a Snowflake database:

DEVICE_SERIAL  REASON_CODE  VERSION  MESSAGE_CREATED_AT   NEXT_REASON_CODE
BA1254862158   1            4        2022-06-23 02:06:03  4
BA1254862158   4            4        2022-06-23 02:07:07  1
BA1110001111   1            5        2022-06-16 16:19:04  4
BA1110001111   4            5        2022-06-16 17:43:04  1
BA1110001111   5            5        2022-06-20 14:37:45  4
BA1110001111   4            5        2022-06-20 17:31:12  1

That is the result of a previous query. I'm trying to get the difference between the message_created_at timestamps of two consecutive rows with the same device_serial, where the first row of the pair has reason_code 1 or 5 and the second row has reason_code 4.

For this example, my desired output would be

DEVICE_SERIAL  VERSION  DELTA_SECONDS
BA1254862158   4        64
BA1110001111   5        5040
BA1110001111   5        10407

It's easy to calculate the time difference between every pair of consecutive rows (just datediff against lead or lag). But I'm not sure how to structure a query that selects only the desired pairs, so that I get a datediff for those and no spurious datediffs for the rest.

My ultimate goal is to see how these datediffs change between versions. I am but a lowly C programmer, my SQL-fu is weak.

CodePudding user response:

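with data as (
    select *,
        -- Running count of start rows (reason_code 1 or 5) per device: each start
        -- row opens a new group, and the rows that follow it share its group number.
        count(case when reason_code in (1, 5) then 1 end)
            over (partition by device_serial order by message_created_at) as grp
        -- Alternatively, bracket by the end code instead, counting backwards in time:
        -- count(case when reason_code = 4 then 1 end)
        --     over (partition by device_serial order by message_created_at desc) as grp
    from T
)
-- One output row per (device, group): seconds from the group's first to its last message.
select device_serial, min(version) as version,
    datediff(second, min(message_created_at), max(message_created_at)) as delta_seconds
from data
group by device_serial, grp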
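
The question mentions lead/lag, so here is a hedged sketch of that other route: pair every row with its successor using lead(), then keep only the pairs where the first row has reason_code 1 or 5 and the next one has reason_code 4. To make it checkable against the sample data, the sketch runs the query in an in-memory SQLite database from Python. That setup is an assumption for illustration, not part of the original answer. SQLite has no datediff, so the delta is computed by subtracting strftime('%s', ...) values. The names paired, next_created_at, PAIR_QUERY and pair_deltas are made up for this example, and T is the table name used in the answer above.

```python
import sqlite3

# Sample rows copied from the question:
# (device_serial, reason_code, version, message_created_at)
ROWS = [
    ("BA1254862158", 1, 4, "2022-06-23 02:06:03"),
    ("BA1254862158", 4, 4, "2022-06-23 02:07:07"),
    ("BA1110001111", 1, 5, "2022-06-16 16:19:04"),
    ("BA1110001111", 4, 5, "2022-06-16 17:43:04"),
    ("BA1110001111", 5, 5, "2022-06-20 14:37:45"),
    ("BA1110001111", 4, 5, "2022-06-20 17:31:12"),
]

PAIR_QUERY = """
with paired as (
    select device_serial, version, reason_code, message_created_at,
        lead(reason_code)
            over (partition by device_serial order by message_created_at) as next_reason_code,
        lead(message_created_at)
            over (partition by device_serial order by message_created_at) as next_created_at
    from T
)
select device_serial, version,
    -- SQLite stand-in for datediff(second, message_created_at, next_created_at)
    cast(strftime('%s', next_created_at) as integer)
        - cast(strftime('%s', message_created_at) as integer) as delta_seconds
from paired
-- keep only (start, end) pairs; every other consecutive pair is dropped
where reason_code in (1, 5) and next_reason_code = 4
order by message_created_at
"""


def pair_deltas(rows):
    """Load rows into an in-memory table T and return (serial, version, delta_seconds)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table T (device_serial text, reason_code integer, "
        "version integer, message_created_at text)"
    )
    conn.executemany("insert into T values (?, ?, ?, ?)", rows)
    result = conn.execute(PAIR_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pair_deltas(ROWS):
        print(row)
```

In Snowflake itself the subtraction would presumably become datediff(second, message_created_at, next_created_at), with the rest of the query unchanged. One difference from the grouping approach: a start row that is not immediately followed by a reason_code 4 row yields no output row here, instead of a zero or inflated delta.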