Home > Enterprise >  Filter data where previous row with same ID has value specific values
Filter data where previous row with same ID has value specific values

Time:01-22

app_id status modified_date
1 A 2022-10-23
1 B 2022-10-10
1 null 2022-10-05
2 A 2022-11-10
2 B 2022-11-03
3 A 2022-12-15

I need to get all app_ids with status A that had value B or null in previous rows, rows that have only status A or other should be ignored.

Expected output of the above sample input table is app_id 1 and 2.

I've tried with subquery in the FROM clause, but not sure if it's the correct way to do it (and it's probably impossible to filter that in that way correctly).

CodePudding user response:

Use the window function LAG() to check the previous status of each row, but because LAG() returns NULL when there is no previous row, which is a valid value for your requirement, you should also count the previous rows with window function COUNT() to make sure that there is a previous row:

WITH cte AS (
    SELECT t.*, 
           LAG(t.status) OVER(PARTITION BY t.app_id ORDER BY t.modified_date) prev_status,
           COUNT(*) OVER(PARTITION BY t.app_id ORDER BY t.modified_date) cnt
    FROM tablename t
)
SELECT app_id
FROM cte
WHERE status = 'A' AND cnt > 1 AND (prev_status = 'B' OR prev_status IS NULL);

See the demo.

CodePudding user response:

This is how I understood the problem.

Sample data:

SQL> with test (app_id, status, modified_date) as
  2    (select 1, 'A', date '2022-10-23' from dual union all
  3     select 1, 'B', date '2022-10-10' from dual union all
  4     select 1, null,date '2022-10-05' from dual union all
  5     --
  6     select 2, 'A', date '2022-11-10' from dual union all
  7     select 2, 'B', date '2022-11-03' from dual union all
  8     --
  9     select 3, 'A', date '2022-12-15' from dual union all
 10     --
 11     select 4, 'A', date '2022-12-28' from dual union all
 12     select 4, 'C', date '2022-12-27' from dual union all
 13     --
 14     select 5, 'B', date '2023-01-13' from dual union all
 15     select 5, 'A', date '2023-01-08' from dual
 16    ),

Query begins here. last_a CTE ranks rows per app_id by date value in descending order, which means that rn = 1 represents the last (current) status for that particular app_id.

 17  last_a as
 18    (select app_id, status, modified_date,
 19       row_number() over (partition by app_id order by modified_date desc) rn
 20     from test
 21    )

Finally, self-join last_a and look for rows whose current status is A (line #24) and rows that aren't current (rn > 1) and have B or null in status column.

 22  select distinct a.app_id
 23  from last_a a join last_a b on a.app_id = b.app_id
 24  where (a.rn = 1 and  a.status = 'A')
 25    and (b.rn > 1 and (b.status = 'B' or b.status is null));

    APP_ID
----------
         1
         2

SQL>
  • Related