Home > OS >  How do you call previous row in a where clause?
How do you call previous row in a where clause?

Time:02-22

I am trying to figure out how to get rid of results that occur close together. For example the rows have a create timestamp (source_time). I want to remove results that occur within 10 seconds of each other.

I thought lag() might do it, but I can't use that in the where clause.

select *
from table
where source_time - previous(source_time) >= 10 second

Very rough code, but I am not sure how to call the previous source time. I have translated them to timestamps and used timestamp_diff(source_time, x, second) >= 10 but not sure how to make x the previous value.

Hopefully this is clear.

CodePudding user response:

You can do this with subqueries.

delete table t1
where t1.id in (
  select t2.id
  from (
    select
      id,
      source_time - lag(source_time) over (order by source_time) as time_diff
    from table
  ) t2
  where t2.time_diff < 10 second
)

Keep in mind this can potentially leave large gaps in your records if. For example, if you get a row every 9 seconds for an hour you'll delete all but the last record in that hour.

You might instead partition the source_time every 10 seconds and delete anything with a row_number > 1.

delete table t1
where t1.id in (
  select t2.id
  from (
    select
      id,
      source_time,
      row_number() over(
        partition by source_time - make_interval(second => extract(second from source_time) % 10)
        order by source_time asc
      ) rownum
    from table
  ) t2
  where rownum > 1
)
  • Related