Postgres - Select next unique value in time-series data

Time:04-30

In a table I have time-series data that looks like the following:

t_stamp              status_val
2022-04-21 8:00 AM   0
2022-04-21 8:01 AM   0
2022-04-21 8:02 AM   3
2022-04-21 8:03 AM   1
2022-04-21 8:04 AM   1
2022-04-21 8:05 AM   0

The example above is heavily simplified. In reality there is a row every second, and status_val can stay the same for hundreds of rows before it changes.

I'm trying to write a query that selects the first row and then only the rows where status_val differs from the previous row. The output should be:

t_stamp              status_val
2022-04-21 8:00 AM   0
2022-04-21 8:02 AM   3
2022-04-21 8:03 AM   1
2022-04-21 8:05 AM   0

It feels like I need something like LEAD(status_val, 1) <> status_val in the WHERE clause, but window functions are not allowed in WHERE.

I've tried GROUP BY and DISTINCT, but neither produced the desired output. I'm not an expert with Postgres, so please be gentle :)

Answer:

Use LAG() inside a CTE to flag rows whose status_val equals the previous row's value, then keep only the unflagged rows in the outer query:

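with mark_repeats as (
  -- flag each row whose status_val equals the previous row's (ordered by t_stamp)
  select t_stamp, status_val,
         coalesce(
           status_val = lag(status_val) over (order by t_stamp),
           false  -- first row has no predecessor, so lag() is null: keep it
         ) as is_repeat
    from your_table
)
-- keep only the rows where the value changed
select t_stamp, status_val
  from mark_repeats
 where not is_repeat
 order by t_stamp;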
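
To try the query without a Postgres server, here is a minimal sketch that runs the same CTE against an in-memory SQLite database from Python. This is an assumption-laden stand-in, not Postgres itself: it needs an SQLite build with window functions (3.25 or later), it writes 0 instead of false for compatibility with older SQLite builds, and it stores the timestamps as ISO-style text so that text ordering matches time ordering. The helper name changed_rows is made up for this illustration.

```python
import sqlite3

# Sample rows from the question, with timestamps written in a sortable text form.
ROWS = [
    ("2022-04-21 08:00", 0),
    ("2022-04-21 08:01", 0),
    ("2022-04-21 08:02", 3),
    ("2022-04-21 08:03", 1),
    ("2022-04-21 08:04", 1),
    ("2022-04-21 08:05", 0),
]

# Same shape as the answer's query; 0 stands in for false (SQLite assumption).
QUERY = """
with mark_repeats as (
  select t_stamp, status_val,
         coalesce(
           status_val = lag(status_val) over (order by t_stamp),
           0
         ) as is_repeat
    from your_table
)
select t_stamp, status_val
  from mark_repeats
 where not is_repeat
 order by t_stamp
"""


def changed_rows(rows):
    """Load rows into a throwaway table and return only the rows where status_val changed."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table your_table (t_stamp text, status_val integer)")
        conn.executemany("insert into your_table values (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in changed_rows(ROWS):
        print(row)
```

With the sample data this keeps the 08:00, 08:02, 08:03 and 08:05 rows, matching the desired output in the question.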
