Let's say I have a table like this:
user_id | order | action |
---|---|---|
1 | 1 | start |
1 | 2 | other |
1 | 3 | other |
1 | 4 | end |
1 | 5 | other |
2 | 1 | start |
2 | 2 | other |
2 | 3 | end |
2 | 4 | other |
2 | 5 | start |
2 | 6 | other |
2 | 7 | end |
And I want to create a new column that flags the rows that appear between "start" and "end" events for each user (ordering by "order"):
user_id | order | action | is_between_start_and_end |
---|---|---|---|
1 | 1 | start | NULL |
1 | 2 | other | 1 |
1 | 3 | other | 1 |
1 | 4 | end | NULL |
1 | 5 | other | NULL |
2 | 1 | start | NULL |
2 | 2 | other | 1 |
2 | 3 | end | NULL |
2 | 4 | other | NULL |
2 | 5 | start | NULL |
2 | 6 | other | 1 |
2 | 7 | end | NULL |
How can I achieve this?
CodePudding user response:
This can be solved with windows functions.
with tbl as (
Select 1 as user_id, 1 as order_it,"start" as action
Union all select 1 , 2 ,"other"
Union all select 1 , 3 ,"other"
Union all select 1 , 4 ,"end"
Union all select 1 , 5 ,"other"
Union all select 2 , 1 ,"start"
Union all select 2 , 2 ,"other"
Union all select 2 , 3 ,"end"
Union all select 2 , 4 ,"other"
Union all select 2 , 5 ,"start"
Union all select 2 , 6 ,"other"
Union all select 2 , 7 ,"end"
),
helper as (
Select *,
countif(action="end") over win_before as ends,
countif(action="start") over win_before as starts,
first_value(if(action="end" or action="start",action,null) ignore nulls) over (partition by user_id order by order_it rows between current row and unbounded following) as end_to_come
from tbl
window win_before as (partition by user_id order by order_it rows between unbounded preceding and current row)
order by user_id,order_it
)
select *,
if(end_to_come="end" and starts-ends=1,1,null) as is_between_start_and_end
from helper
order by user_id,order_it
CodePudding user response:
This should work but could surely be more optimized
with input as (
select 1 user_id, 1 as order_, 'start' action union all
select 1, 2, 'other' union all
select 1, 3, 'other' union all
select 1 , 4 , 'end' union all
select 1 , 5 , 'other' union all
select 2 , 1 , 'start' union all
select 2 , 2 , 'other' union all
select 2 , 3 , 'end' union all
select 2 , 4 , 'other' union all
select 2 , 5 , 'start' union all
select 2 , 6 , 'other' union all
select 2 , 7 , 'end'
)
select
*,
if (
order_ > max(if(action = 'start', order_, null))
over(partition by user_id order by order_ range between unbounded preceding and current row) and
order_ < min(if(action = 'end', order_, null))
over(partition by user_id order by order_ range between current row and unbounded following) and
coalesce(order_ not between
max(if(action = 'end', order_, null))
over(partition by user_id order by order_ range between unbounded preceding and 1 preceding)
and min(if(action = 'start', order_, null))
over(partition by user_id order by order_ range between 1 following and unbounded following), true)
, 1, null) as flag
from input
order by 1,2
Edit: It should also take into account weird cases where for instance a 3rd user has end > other > start > other > end > other in that order. The flag should only apply to the 4th item. If you have start > other > start > other > end however, it's unclear if items 2,3,4 or 4 or 2,4 should be flagged. I think it would only flag 4 here