Home > Mobile >  Flag rows that appear between rows with specific strings
Flag rows that appear between rows with specific strings

Time:10-18

Let's say I have a table like this:

user_id order action
1 1 start
1 2 other
1 3 other
1 4 end
1 5 other
2 1 start
2 2 other
2 3 end
2 4 other
2 5 start
2 6 other
2 7 end

And I want to create a new column that flags the rows that appear between "start" and "end" events for each user (ordering by "order"):

user_id order action is_between_start_and_end
1 1 start NULL
1 2 other 1
1 3 other 1
1 4 end NULL
1 5 other NULL
2 1 start NULL
2 2 other 1
2 3 end NULL
2 4 other NULL
2 5 start NULL
2 6 other 1
2 7 end NULL

How can I achieve this?

CodePudding user response:

This can be solved with windows functions.

with tbl as (
Select  1   as user_id, 1   as order_it,"start" as action
Union all select    1   ,   2   ,"other"
Union all select    1   ,   3   ,"other"
Union all select    1   ,   4   ,"end"
Union all select    1   ,   5   ,"other"
Union all select    2   ,   1   ,"start"
Union all select    2   ,   2   ,"other"
Union all select    2   ,   3   ,"end"
Union all select    2   ,   4   ,"other"
Union all select    2   ,   5   ,"start"
Union all select    2   ,   6   ,"other"
Union all select    2   ,   7   ,"end"
),
helper as (
Select *, 
countif(action="end") over win_before as ends,
countif(action="start") over win_before as starts,
first_value(if(action="end" or action="start",action,null) ignore nulls) over (partition by user_id order by order_it rows between current row and unbounded following) as end_to_come
 from tbl
 window win_before as (partition by user_id order by order_it rows between unbounded preceding and current row)
 order by user_id,order_it
 )
 select *,
 if(end_to_come="end" and starts-ends=1,1,null) as is_between_start_and_end
  from helper
 order by user_id,order_it

CodePudding user response:

This should work but could surely be more optimized

with input as (
select 1 user_id,   1 as order_,    'start' action union all 
select 1,   2,  'other' union all 
select 1,   3,  'other' union all 
select 1 ,  4 , 'end' union all 
select 1 ,  5 , 'other' union all 
select 2 ,  1 , 'start' union all 
select 2 ,  2 , 'other' union all 
select 2 ,  3 , 'end' union all 
select 2 ,  4 , 'other' union all 
select 2 ,  5 , 'start' union all 
select 2 ,  6 , 'other' union all 
select 2 ,  7 , 'end' 
)

select 
  *, 
  if (
      order_ > max(if(action = 'start', order_, null)) 
        over(partition by user_id order by order_ range  between unbounded preceding and current row) and 
      order_ < min(if(action = 'end', order_, null)) 
        over(partition by user_id order by order_ range between current row and unbounded following) and 
      coalesce(order_ not between 
        max(if(action = 'end', order_, null)) 
        over(partition by user_id order by order_ range  between unbounded preceding and 1 preceding)
        and min(if(action = 'start', order_, null)) 
        over(partition by user_id order by order_ range  between 1 following and unbounded following), true)
    , 1, null) as flag
 from input 
 order by 1,2

Edit: It should also take into account weird cases where for instance a 3rd user has end > other > start > other > end > other in that order. The flag should only apply to the 4th item. If you have start > other > start > other > end however, it's unclear if items 2,3,4 or 4 or 2,4 should be flagged. I think it would only flag 4 here

  • Related