Home > Net >  Eliminate overlapping date ranges
Eliminate overlapping date ranges

Time:12-14

My data in an Oracle table is like this. I need a solution in Oracle SQL

StDt EdDt User Stat
20-12-2021 12-06-2022 A
16-06-2022 31-12-4712 A
09-06-2022 30-06-2022 B

OUTPUT :-

StDt EdDt
20-12-2021 31-12-4712

This output is because the person was active throughout the time till 31-12-4712.

Another Scenario :-

StDt EdDt User Stat
20-12-2021 31-12-4712 A
09-06-2022 30-06-2022 B

Output :-

StDt EdDt
20-12-2021 31-12-4712

Another Scenario :-

StDt EdDt User Stat
20-12-2021 12-06-2022 A
16-06-2022 25-06-2022 A
20-06-2022 30-06-2022 B
10-10-2022 31-03-2023 B

Output :-

StDt EdDt
20-12-2021 12-06-2022
16-06-2022 30-06-2022
10-10-2022 31-03-2022

So in short we have to remove the overlapping date range here.

CodePudding user response:

This is a classical job for MATCH_RECOGNIZE, a general pattern:

MATCH_RECOGNIZE (
     PARTITION BY userstat
     ORDER BY stdt, eddt
     MEASURES FIRST(stdt) AS stdt, MAX(eddt) as eddt
     PATTERN( merged* start )
     DEFINE
        merged AS MAX(eddt) >= NEXT(stdt)
)

CodePudding user response:

You can use a MERGE statement with MATCH_RECOGNIZE:

MERGE INTO table_name dst
USING (
  SELECT ROWID AS rid,
         rn,
         MAX(eddt) OVER (PARTITION BY user_stat, mno) AS eddt
  FROM   table_name
  MATCH_RECOGNIZE(
    PARTITION BY user_stat
    ORDER BY StDt
    MEASURES
      COUNT(*) AS rn,
      MATCH_NUMBER() AS mno
    ALL ROWS PER MATCH
    PATTERN (overlapping* final_row)
    DEFINE
      overlapping AS MAX(eddt) >= NEXT(stdt)
  )
) src
ON (dst.ROWID = src.rid)
WHEN MATCHED THEN
  UPDATE
  SET   eddt = src.eddt
  DELETE WHERE rn > 1;

Which, for the sample data:

CREATE TABLE table_name (StDt, EdDt, User_Stat) AS
SELECT DATE '2021-12-20', DATE '2022-06-12', 'A' FROM DUAL UNION ALL
SELECT DATE '2022-06-16', DATE '4712-12-31', 'A' FROM DUAL UNION ALL
SELECT DATE '2022-06-09', DATE '2022-06-30', 'B' FROM DUAL UNION ALL
SELECT DATE '2022-06-09', DATE '2022-06-30', 'C' FROM DUAL UNION ALL
SELECT DATE '2022-06-15', DATE '2022-06-20', 'C' FROM DUAL UNION ALL
SELECT DATE '2022-06-15', DATE '2022-06-20', 'D' FROM DUAL UNION ALL
SELECT DATE '2022-06-18', DATE '2022-06-23', 'D' FROM DUAL UNION ALL
SELECT DATE '2022-06-25', DATE '2022-06-30', 'D' FROM DUAL;

Then, after the MERGE statement the table contains:

STDT EDDT USER_STAT
2021-12-20 00:00:00 2022-06-12 00:00:00 A
2022-06-16 00:00:00 4712-12-31 00:00:00 A
2022-06-09 00:00:00 2022-06-30 00:00:00 B
2022-06-09 00:00:00 2022-06-30 00:00:00 C
2022-06-15 00:00:00 2022-06-23 00:00:00 D
2022-06-25 00:00:00 2022-06-30 00:00:00 D

fiddle

  • Related