Home > database >  SQL keep rows that include change in a subset of the columns
SQL keep rows that include change in a subset of the columns

Time:12-15

I have the following table

var1 time var2
val1 1 a
val1 2 a
val1 3 a
val1 4 a
val2 5 a
val1 6 a
val3 1 b
val4 2 b
val4 3 b
val5 4 b

And I wish to remove all the rows with repeating values in var1 and var2 keeping changes in at least one of them. Sorry for the unclear explanation, instead, here is an example of how the desired table will be:

var1 time var2
val1 1 a
val2 5 a
val1 6 a
val3 1 b
val4 2 b
val5 4 b

Lastly, is there a way to know how much space this table will take before executing the query?

EDIT: fixing variables' names in the tables

CodePudding user response:

This is a gaps and islands problem, you could try the difference between two row numbers to solve it as the following:

WITH define_groups AS
(
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY var2 ORDER BY time) -
    ROW_NUMBER() OVER (PARTITION BY var1, var2 ORDER BY time) grp
  FROM table_name
),
row_numbering AS
(
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY var2, grp ORDER BY time) rn
  FROM define_groups
)
SELECT var1, time, var2
FROM row_numbering
WHERE rn = 1
ORDER BY var2, time

See demo.

  • Related