Home > Software design >  SQL: Merge Overlapping Date Ranges
SQL: Merge Overlapping Date Ranges

Time:01-02

I have two tables both containing date ranges, the first table contains the default record ID that applies during the date range:

STARTDATE          | ENDDATE            | RECORDID
__________________________________________________
2022/Nov/01 00:00  | 2022/Nov/30 00:00  | 10
2022/Dec/01 00:00  | 2022/Dec/31 00:00  | 16

The second table contains the override record id that overrides the default record ID for the date range specified if present, if there isnt an override for the date in question then the default record represents that date range:

STARTDATE          | ENDDATE            | RECORDID
__________________________________________________
2022/Nov/14 00:00  | 2022/Nov/16 00:00  | 12
2022/Dec/06 00:00  | 2022/Dec/20 00:00  | 18

The override table records always fall within the dates covered by dates in the default table.

The end result for the above scenario should look like this:

STARTDATE          | ENDDATE            | RECORDID
__________________________________________________
2022/Nov/01 00:00  | 2022/Nov/14 00:00  | 10
2022/Nov/14 00:00  | 2022/Nov/16 00:00  | 12
2022/Nov/16 00:00  | 2022/Nov/30 00:00  | 10
2022/Dec/01 00:00  | 2022/Dec/06 00:00  | 16
2022/Dec/06 00:00  | 2022/Dec/20 00:00  | 18
2022/Dec/20 00:00  | 2022/Dec/31 00:00  | 16

Is there a way to do this with SQL Server 2019? I made an attempt but it was filled with clumsy loops and the end result was incorrect and would not have been efficient at all.

CodePudding user response:

Try the following query, maybe there is a simpler way to do it, but the following works.

/*
left join the two tables to get the overlapped date ranges.
the left join is used in case there is no default date range
in the default table that overlapped with the override table date range.
*/
with overlapped_ranges as
(
  select D.StartDate def_stdt, D.EndDate def_endt, D.RecordId def_rid, 
         O.StartDate ovr_stdt, O.EndDate ovr_endt, O.RecordId ovr_rid
  from override_records O left join default_records D  
  on O.StartDate <= D.EndDate and O.EndDate >= D.StartDate
),
/* 
from the joined tables, get all of the date range boundaries as values
using cross-apply. you could run (select * from get_all_ranges_boundaries)
to see the output of this query.
*/
get_all_ranges_boundaries as
(
  select OV.*, dt
  from overlapped_ranges OV
  cross apply 
  (
    select * 
    from (values (def_stdt), (def_endt), (ovr_stdt), (ovr_endt)) as T(dt)
  ) all_dates_boundaries
),
/*
now we have all the date range boundaries, so the start date, end date values
will be the date boundary as the start date and the next date boundary as the end date.
i.e. 
Values('2022-11-01','2022-11-30','2022-11-14','2022-11-16') will be
('2022-11-01'-'2022-11-14'),('2022-11-14'-'2022-11-16'),('2022-11-16'-'2022-11-30')
*/
stDate_enDate as
(
  select dt StartDate, 
         lead(dt) over (partition by def_stdt, def_endt, ovr_stdt, ovr_endt order by dt) EndDate,
         case 
          when dt >= ovr_stdt and lead(dt) over (partition by def_stdt, def_endt, ovr_stdt, ovr_endt order by dt) <= ovr_endt 
          then ovr_rid else def_rid 
         end as RecordId -- check if the populated date range occurs within the override date range, if yes select override id, else select default id.
  from get_all_ranges_boundaries
)

select StartDate, EndDate, RecordId
from stDate_enDate
where EndDate is not null and -- the last date range boundary has no lead value, this is to exclude the last row for each overlapped pair.
      StartDate is not null and  -- exclude null values where an override range does not overlap with default ranges. (from the left join query) 
      StartDate <> EndDate -- exclude rows where StartDate = EndDate i.e. ('2022-11-01',**'2022-11-05','2022-11-05'**,'2022-11-16')
order by StartDate, EndDate

See demo.

  • Related