Home > OS >  Removing the current row from a table if the column value of the previous row is the same Snowflake
Removing the current row from a table if the column value of the previous row is the same Snowflake

Time:08-31

I have a table as follows:

ID ACTIVE_STATUS DATE
45 TRUE 2022-06-12
45 TRUE 2022-06-13
45 FALSE 2022-07-01
36 TRUE 2022-08-01
36 FALSE 2022-08-02
36 FALSE 2022-08-14
36 TRUE 2022-08-15
14 TRUE 2022-03-25
14 TRUE 2022-03-28
14 TRUE 2022-03-29

I would like to remove rows from the table where within each ID group, if the current ACTIVE_STATUS value is the same as the value in the previous row, then remove the current row (Basically I am keeping the rows where the ACTIVE_STATUS shows a change for each ID group).

For example for ID 45, the active status was TRUE on 2022-06-12 and stayed TRUE until it became FALSE on 2022-07-01 so I would delete the row where the status is TRUE for date 2022-06-13 since there is no change in status between that and the previous row. I currently have the data ordered by DATE per ID group. I would like the output to look like

ID ACTIVE_STATUS DATE
45 TRUE 2022-06-12
45 FALSE 2022-07-01
36 TRUE 2022-08-01
36 FALSE 2022-08-02
36 TRUE 2022-08-15
14 TRUE 2022-03-25

I currently have:

SELECT ID, ACTIVE_STATUS, DATE 
FROM MY_TABLE 
GROUP BY ID, ACTIVE_STATUS, DATE   
ORDER BY DATE;

But I am not sure how to use lag() to achieve this or a partition? Any help would be great!

CodePudding user response:

Using enter image description here

Before filtering:

strong text

CodePudding user response:

Here's how you would use lag. You can keep the lag in the select to see what it's doing. Qualify is to Window Function what Having is to Group By.

select *
from your_table
qualify lag(active_status) over(partition by id order by date) <> active_status or
        lag(active_status) over(partition by id order by date) is null

If your team is familiar with null-safe equality operators, you could use is distinct from and simplify that to

select *
from your_table
qualify lag(active_status) over(partition by id order by date) is distinct from active_status
  • Related