I have data like this
Start End
2022-01-01 2022-01-31
2022-02-01 2022-02-28
2022-03-05 2022-04-05
2022-05-01 2022-05-31
2022-06-01 2022-06-30
2022-07-01 2022-07-15
I want to find all continuous periods (where there is no more than 1 day between the start and the end) An example of the desired result
2022-01-01 2022-02-28
2022-03-05 2022-04-05
2022-05-01 2022-07-15
CodePudding user response:
There is the need to combine the value between rows. LEAD
and LAG
are these window
functions. With these the comparision of the start date with the previous end date is possible. The same is done for the end date with the next start date. Thus we obain two columns A
and B_
which have only an entrie for breaks.
We need to filter all rows where A
and B_
have both nulls, qualify
helps here to prevent a further select
statement. The A
and B_
dates are now split often over two rows, therefore the end date in column B
is put in the above row, if this one is null: ifnull(B_,lead(B_) over (order by start))
with tbl as
(
Select date("2022-01-01") as start,date("2022-01-31") as ending
union all Select date("2022-02-01"),date("2022-02-28")
union all Select date("2022-03-05"),date("2022-04-05")
union all Select date("2022-05-01"),date("2022-05-31")
union all Select date("2022-06-01"),date("2022-06-30")
union all Select date("2022-07-01"),date("2022-07-15")
)
Select A, ifnull(B_,lead(B_) over (order by start)) B
from
(
Select *,
if(ifnull(date_diff(start,lag(ending) over (order by Start),day),99)>1,start,null) A,
if(ifnull(date_diff(ending,lead(start) over (order by Start),day),-99)<-1,ending,null) B_
from tbl
Qualify A is not null or B_ is not null
#order by Start
)
Qualify A is not null
order by A