SQL - map rows based on date window and condition


I asked a very similar question here and am now trying to handle an edge case that came up. In the original problem the dates lined up nicely: associated cases had matching open_date/close_date pairs. But in the example below, case4 and case3 are supposed to be linked even though case3 closed after case4 opened. We can tell these two cases should be linked because case3 cannot be the final case in a sequence: it has status deferred, and its close_date falls close to when the next deferred case opened. Is there a good way to fold this condition into the open_date/close_date matching?

case_id open_date  close_date user_id type     status      
case5   2021-06-01 2021-08-25  user1   request  complete
case4   2021-05-05 2021-06-01  user1   request  deferred
case3   2021-03-01 2021-05-12  user1   request  deferred
case2   2020-09-15 2021-03-01  user1   request  deferred
case1   2020-09-01 2020-09-15  user1   request  deferred
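
The linking rule described above can be phrased as an overlap test: a case continues the sequence whenever it opens on or before the previous case's close_date. A minimal Python sketch over the sample data (an illustration of the rule only, not part of any answer):

```python
from datetime import date

# Sample cases from the question, newest first: (case_id, open_date, close_date).
cases = [
    ("case5", date(2021, 6, 1),  date(2021, 8, 25)),
    ("case4", date(2021, 5, 5),  date(2021, 6, 1)),
    ("case3", date(2021, 3, 1),  date(2021, 5, 12)),
    ("case2", date(2020, 9, 15), date(2021, 3, 1)),
    ("case1", date(2020, 9, 1),  date(2020, 9, 15)),
]

chrono = list(reversed(cases))  # oldest first

# Two consecutive cases belong to one sequence when the later one opens on
# or before the earlier one's close_date (touching or overlapping windows).
links = [(prev[0], cur[0], cur[1] <= prev[2])
         for prev, cur in zip(chrono, chrono[1:])]

for prev_id, cur_id, linked in links:
    print(prev_id, "->", cur_id, "linked" if linked else "gap")
```

In particular, case3 -> case4 counts as linked because 2021-05-05 (case4's open) falls on or before 2021-05-12 (case3's close), even though the dates are not equal.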

CodePudding user response:

The solution is just a tiny adjustment (based on the original solution provided by @MikhailBerlyant) that detects and allows overlap of successive cases.

The fiddle (with original and new problem)

Notice the inequality: open_date > lead(close_date)

with your_table (case_id, open_date, close_date, user_id, type) as (
  select 'case5', '2021-06-01', '2021-08-25', 'user1', 'request' union all
  select 'case4', '2021-05-05', '2021-06-01', 'user1', 'request' union all
  select 'case3', '2021-03-01', '2021-05-12', 'user1', 'request' union all
  select 'case2', '2020-09-15', '2021-03-01', 'user1', 'request' union all 
  select 'case1', '2020-09-01', '2020-09-15', 'user1', 'request' 
)
select *, 
  case row_number() over(partition by user_id, type, map_id order by open_date) 
    when 1 then 'new case'
    -- the last row of a group is the one whose row_number equals the group size
    when count(1) over(partition by user_id, type, map_id) then 'last deferred case'
    else 'deferred case'
  end as status
from (
  select *, 
    -- running count of group starts = group id
    SUM(new_case) over(partition by user_id, type order by open_date) as map_id
  from (
    select *, 
      -- a new group starts only when this case opens strictly after the previous
      -- one closed; COALESCE handles the earliest case (no previous row => 1)
      COALESCE(open_date > lead(close_date) over(partition by user_id, type order by open_date desc), 1) new_case
    from your_table
  ) AS t1
) AS t2
ORDER BY open_date DESC
;

The result:

case_id open_date close_date user_id type new_case map_id status
case5 2021-06-01 2021-08-25 user1 request 0 1 last deferred case
case4 2021-05-05 2021-06-01 user1 request 0 1 deferred case
case3 2021-03-01 2021-05-12 user1 request 0 1 deferred case
case2 2020-09-15 2021-03-01 user1 request 0 1 deferred case
case1 2020-09-01 2020-09-15 user1 request 1 1 new case
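
The same query runs unchanged under SQLite (3.25+ for window functions, and 1/0 comparison results in place of booleans), so the result above can be reproduced locally; a minimal Python sketch using the standard-library sqlite3 module:

```python
import sqlite3

con = sqlite3.connect(":memory:")  # needs SQLite >= 3.25 for window functions
con.executescript("""
create table your_table (case_id, open_date, close_date, user_id, type);
insert into your_table values
  ('case5','2021-06-01','2021-08-25','user1','request'),
  ('case4','2021-05-05','2021-06-01','user1','request'),
  ('case3','2021-03-01','2021-05-12','user1','request'),
  ('case2','2020-09-15','2021-03-01','user1','request'),
  ('case1','2020-09-01','2020-09-15','user1','request');
""")

rows = con.execute("""
select case_id, new_case, map_id,
  case row_number() over(partition by user_id, type, map_id order by open_date)
    when 1 then 'new case'
    when count(1) over(partition by user_id, type, map_id) then 'last deferred case'
    else 'deferred case'
  end as status
from (
  select *,
    sum(new_case) over(partition by user_id, type order by open_date) as map_id
  from (
    select *,
      -- ISO date strings compare correctly as text
      coalesce(open_date > lead(close_date)
               over(partition by user_id, type order by open_date desc), 1) as new_case
    from your_table
  ) as t1
) as t2
order by open_date desc
""").fetchall()

for r in rows:
    print(r)
```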

CodePudding user response:

You will have to make a small adjustment to your BigQuery code in the innermost subquery:

with your_table as ( 
  select 'case5' as case_id, '2021-06-01' as open_date, '2021-08-25' as close_date, 'user1' as user_id, 'request' as type union all
  select 'case4', '2021-05-05', '2021-06-01', 'user1', 'request' union all
  select 'case3', '2021-03-01', '2021-05-12', 'user1', 'request' union all
  select 'case2', '2020-09-15', '2021-03-01', 'user1', 'request' union all 
  select 'case1', '2020-09-01', '2020-09-15', 'user1', 'request' 
)
select *, 
  case row_number() over(partition by user_id, type, map_id order by open_date) 
    when 1 then 'new case'
    -- the last row of a group is the one whose row_number equals the group size
    when count(1) over(partition by user_id, type, map_id) then 'last deferred case'
    else 'deferred case'
  end as status
from (
  select * except(new_case), 
    -- running count of group starts = group id
    countif(new_case) over(partition by user_id, type order by open_date) as map_id 
  from (
    select *, 
      -- false when this case opens on/before the previous close_date (overlap),
      -- true when it opens strictly after it (a gap starts a new group),
      -- false for the earliest case (lead is null, so both comparisons are null)
      case when open_date <= lead(close_date) over(partition by user_id, type order by open_date desc) then false
           when open_date != lead(close_date) over(partition by user_id, type order by open_date desc) then true
           else false end new_case
    from your_table
  )
)
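
Since BigQuery cannot be run locally, here is a hedged pure-Python sketch of the same new_case/map_id computation over the sample rows (an illustration of the logic, not BigQuery itself). Note that this variant marks the earliest case false (the else branch), so map_id starts at 0 rather than 1; the grouping it produces is the same:

```python
from itertools import accumulate

# (case_id, open_date, close_date) in chronological order -- the sample data.
rows = [
    ("case1", "2020-09-01", "2020-09-15"),
    ("case2", "2020-09-15", "2021-03-01"),
    ("case3", "2021-03-01", "2021-05-12"),
    ("case4", "2021-05-05", "2021-06-01"),
    ("case5", "2021-06-01", "2021-08-25"),
]

# new_case mirrors the CASE expression: true only when a case opens strictly
# after the previous case's close_date (ISO date strings compare correctly
# as text); the earliest case, with no predecessor, is false.
new_case = [prev is not None and cur[1] > prev[2]
            for prev, cur in zip([None] + rows[:-1], rows)]

# map_id mirrors COUNTIF(new_case) OVER (... ORDER BY open_date): a running
# count of group starts.
map_id = list(accumulate(int(f) for f in new_case))
```

With the sample data every case overlaps (or touches) its predecessor, so new_case is false everywhere and all five rows share one map_id.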

I have tested with the old scenario, the new scenario, and a combined one; this should address your issue.
