For the table:
date | phase | user_id |
---|---|---|
1.1.20 | a | 10 |
2.1.20 | a | 10 |
3.1.20 | b | 10 |
4.1.20 | a | 10 |
5.1.20 | a | 10 |
6.1.20 | b | 10 |
I need to return for each user_id the start date and end date for each phase when a contains all the phases in the middle and b is the end phase, without using window functions. Should look like this:
user_id | start_date | end_date |
---|---|---|
10 | 1.1.20 | 3.1.20 |
10 | 4.1.20 | 6.1.20 |
CodePudding user response:
with cte1 as(
select user_id,date,
case
when phase = 'a' and (lag(phase)over(order by date) is null or lag(phase)over(order by date) = 'b')
then 1
else 0
end grp
from tb),
cte2 as(
select user_id,date, sum(grp)over(order by date) as rnk
from cte1)
select user_id,min(date)start_date,max(date)end_date
from cte2
group by user_id,rnk
CodePudding user response:
Without using window functions.
You can do a self-join to the table.
Then calculate a rank/group for the phases.
Then it's simple to group by the user and the phase group.
select user_id , min([date]) as start_date , max([date]) as end_date from ( select t1.user_id, t1.[date], t1.phase , count(t2.phase) as grp from your_table t1 left join your_table t2 on t2.user_id = t1.user_id and t2.phase = 'b' and t2.[date] >= t1.[date] group by t1.user_id, t1.[date], t1.phase ) q group by user_id, grp order by user_id, start_date
user_id | start_date | end_date |
---|---|---|
10 | 2020-01-01 | 2020-01-03 |
10 | 2020-01-04 | 2020-01-06 |
Test on db<>fiddle here