Home > Software engineering >  PostgreSQL: Merge times into intervals based on another column alternating value
PostgreSQL: Merge times into intervals based on another column alternating value

Time:07-08

I have an event data structure with timestamp and action for each device.

I`m trying to merge intervals together to get the time ranges, when the device is OFF while also merging overlapping intervals.

I do have a code which does this on my server and I can port it to PL/pgSQL. But I was wondering if this is achievable using SQL.

I know how to merge overlapping intervals when I have them, but I'm not sure how to eliminate the duplicate OFF or ON actions following each other, so that I could easily build the OFF ranges.

Example input:

           time          |    action     |
------------------------- --------------- 
2022-07-09 16:36:05.990  |      OFF
2022-07-10 09:01:28.263  |      ON
2022-07-10 12:40:42.759  |      OFF
2022-07-10 13:21:12.208  |      ON
2022-07-10 13:21:12.261  |      OFF
2022-07-10 13:21:23.225  |      ON
2022-07-10 16:47:10.393  |      OFF
2022-07-11 13:45:09.033  |      OFF
2022-07-11 13:45:10.439  |      OFF
2022-07-11 13:45:10.441  |      OFF
2022-07-11 13:48:26.640  |      OFF
2022-07-11 13:48:28.070  |      OFF
2022-07-11 13:49:37.502  |      OFF
2022-07-11 13:49:37.503  |      OFF
2022-07-11 13:49:38.892  |      OFF
2022-07-11 13:49:38.895  |      OFF
2022-07-11 15:57:09.478  |      OFF
2022-07-11 15:58:00.995  |      ON
2022-07-11 15:58:01.048  |      OFF
2022-07-11 15:58:12.055  |      ON

Desired output (range can be in 2 columns)

           range                                   |    state     |
--------------------------------------------------- --------------- 
2022-07-09 16:36:05.990 - 2022-07-10 09:01:28.263  |      OFF
2022-07-10 12:40:42.759 - 2022-07-10 13:21:12.208  |      OFF
2022-07-10 13:21:12.261 - 2022-07-10 13:21:23.225  |      OFF
2022-07-10 16:47:10.393 - 2022-07-11 15:58:00.995  |      OFF
2022-07-11 15:58:01.048 - 2022-07-11 15:58:12.055  |      OFF

CodePudding user response:

This is a type of gaps and islands problem. We can use the difference in row numbers method here, along with LEAD() to find to the ending ON timestamp for each OFF range.

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (ORDER BY time) rn1,
              ROW_NUMBER() OVER (PARTITION BY action ORDER BY time) rn2,
              LEAD(time) OVER (ORDER BY time) lead_time
    FROM yourTable
)

SELECT MIN(time)::text || ' - ' || MAX(lead_time)::text AS range,
       'OFF' AS state
FROM cte
WHERE action = 'OFF'
GROUP BY rn1 - rn2
ORDER BY MIN(time);

screen capture from demo link below

Demo

  • Related