create time range with 2 columns date

The problem I am facing is how to find distinct time periods from multiple time periods with overlap.

For example, the attached tables contain multiple overlapping time periods, how can I combine those time periods into 3 unique time periods in SQL???

I think I can do it in python with the loop function, but not sure how to do it in SQL

ID	Start Date	End Date
001	2005-01-01	2006-01-01
001	2005-01-01	2007-01-01
001	2008-01-01	2008-06-01
001	2008-04-01	2008-12-01
001	2010-01-01	2010-05-01
001	2010-04-01	2010-12-01
001	2010-11-01	2012-01-01

My expected result is:

ID	start_Date	end_date
001	2005-01-01	2007-01-01
001	2008-01-01	2008-12-01
001	2010-01-01	2012-01-01

CodePudding user response：

This is a gaps and islands problem. Try this:

with u as 
(select *,
case 
when start_date <= lag(end_date) over(partition by ID order by start_date, end_date) then 0 
else 1 end as grp
from table_name),
v as
(select ID, start_date, end_date,
sum(grp) over(partition by ID order by start_date, end_date) as island
from u)
select ID, min(start_date) as start_Date, max(end_date) as end_date
from v
group by ID, island;

Fiddle

Basically you can identify "islands" by comparing start_date of current row to end_date of previous row (ordered by start_date, end_date), if it precedes it then it's the same island. Then you can do a rolling sum() to get the island numbers. Finally select min(start_date) and max(end_date) from each island to get the desired output.

CodePudding user response：

This may work ,with little bit of change in function , I tried it in Dbeaver :

select ID,Start_Date,End_Date
from 
(
select t.*,
dense_rank () over(partition by extract (year from Start_Date) order BY End_Date desc) drnk
from testing_123 t
) temp
where temp.drnk = 1
ORDER BY Start_Date;

CodePudding user response：

Try this

WITH a as (
  SELECT
    ID,
    LEFT(Start_Date, 4) as Year,
    MIN(Start_Date) as New_Start_Date
  FROM
    TAB1
  GROUP BY
    ID,
    LEFT(Start_Date, 4)
), b as (
  SELECT 
    a.ID,
    Year,
    New_Start_Date,
    End_Date
  FROM
    a
  LEFT JOIN
    TAB1
   ON LEFT(a.New_Start_Date, 4) = LEFT(TAB1.Start_Date, 4)
)
select 
  ID,
  New_Start_Date as Start_Date,
  MAX(End_Date)
from 
  b
GROUP BY
  ID,
  New_Start_Date;

Example: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=97f91b68c635aebfb752538cdd752ace