Home > Net >  How to detect streaks of continuous activity in Postgres?
How to detect streaks of continuous activity in Postgres?

Time:12-01

I want to aggregate data based on streaks of continuous activity.

DDL:

CREATE TABLE t_series (t date, data int)
INSERT INTO t_series VALUES
(date '2018-03-01',12),
(date '2018-03-02',43),
(date '2018-03-03',9),
(date '2018-03-04',13),
(date '2018-03-09',23),
(date '2018-03-10',26),
(date '2018-03-11',28),
(date '2018-03-14',21),
(date '2018-03-15',15)

I want an intermediate output as:

          t | data | period 
------------ ------ ------
 2018-03-01 |   12 | 1
 2018-03-02 |   43 | 1
 2018-03-03 |    9 | 1
 2018-03-04 |   13 | 1
 2018-03-09 |   23 | 2
 2018-03-10 |   26 | 2
 2018-03-11 |   28 | 2
 2018-03-14 |   21 | 3
 2018-03-15 |   15 | 3

And the final output as:

period  | sum
-------- -----
      1 | 77
      2 | 77
      3 | 36

I have tried using below but doesn't seem to work:

SELECT *, SUM(CASE WHEN diff IS NULL
                     OR diff <2 THEN 1 ELSE NULL END) OVER (ORDER BY t) AS period
       FROM (SELECT *, t - lag(t, 1) OVER (ORDER BY t) AS diff
             FROM   t_series
       ) AS x;

Could anyone please suggest a fix.

Thanks in advance.

CodePudding user response:

I came up with this solution:

SELECT period, SUM(data) AS sum
FROM (
  SELECT t, data, SUM(groups) OVER (ORDER BY t) AS period
  FROM (
    SELECT t, data,
      CASE
        WHEN diff IS NULL OR diff = 1 THEN 0
        ELSE 1
      END AS groups
    FROM (
      SELECT t, data, t - LAG(t) OVER (ORDER BY t) AS diff
      FROM t_series
    ) d
  ) g -- your intermediate output
) p
GROUP BY period
ORDER BY period
;

Result:

 period | sum 
-------- -----
      0 |  77
      1 |  77
      2 |  36

The only difference is that my period starts with 0, but I think it's ok

  • Related