Home > Software design >  Creating datasets based on overlapping dates
Creating datasets based on overlapping dates

Time:02-27

I have a dataset which has window start and end dates. For an account there could be multiple cases and each case has separate window start and end dates. I want to create two datasets, one where the cases have overlapping window start and end dates and the other which does not have overlapping cases.

dataset that I have

 input acct_num case window_start window_end ;
    DATALINES; 
    55203610 1  10JAN2020 23FEB2020
    55203610 2  17JAN2020 13MAR2020
    55203610 3  31DEC2019 17MAR2020
    55203612 1  22FEB2021 09JUN2021  
    55203612 2  27AUG2021 31DEC2021
    55203614 1  11FEB2018 21MAR2018
    55203614 2  14MAR2018 23JUL2018
    ;
    run;

Datasets that I want.

overlapping dataset 

    55203610 1  10JAN2020 23FEB2020
    55203610 2  17JAN2020 13MAR2020
    55203610 3  31DEC2019 17MAR2020
    55203614 1  11FEB2018 21MAR2018
    55203614 2  14MAR2018 23JUL2018

Non overlapping dataset

    55203612 1  22FEB2021 09JUN2021  
    55203612 2  27AUG2021 31DEC2021

SO if you see the overlapping datasets has accounts with cases the dates are overlapping.

Any help will be appreciated.

CodePudding user response:

Based on your data set this works.

  • Identify overlaps in initial data set and create a flag
  • Filter data set based on any IDs that have an overlap flag
  • Filter data set based on inverse of any IDs that have an overlap flag

data have;
    infile cards;
    input acct_num case window_start : date9. window_end : date9.;
    format window_: date9.;
    cards;
55203610 1  10JAN2020 23FEB2020
55203610 2  17JAN2020 13MAR2020
55203610 3  31DEC2019 17MAR2020
55203612 1  22FEB2021 09JUN2021  
55203612 2  27AUG2021 31DEC2021
55203614 1  11FEB2018 21MAR2018
55203614 2  14MAR2018 23JUL2018
    ;
run;
data check;
    set have;
    by acct_num;
    prev_start=lag(window_start);
    prev_end=lag(window_end);

    if not first.acct_num and prev_start<window_start<prev_end then
        overlap=1;
run;

proc sql;
    create table want_overlap as select * from have where acct_num in (select 
        distinct acct_num from check where overlap=1);
quit;

proc sql;
    create table want_no_overlap as select * from have where acct_num not 
        in (select distinct acct_num from check where overlap=1);
quit;
  • Related