I have a table of subject records where each subject as 1 or more rows that have a begin/end date along with an index date:
SUBJECT | BEGIN_DATE | END_DATE | INQUIRY_DATE |
---|---|---|---|
1 | 1988-01-01 | 2010-04-05 | 2022-05-06 |
1 | 2010-04-06 | 2022-10-02 | 2022-05-06 |
2 | 1996-09-24 | 2005-08-08 | 2022-10-01 |
2 | 2016-11-21 | 2022-04-04 | 2022-10-01 |
3 | 2005-01-01 | 2021-02-12 | 2022-03-21 |
4 | 1999-12-31 | 2015-07-16 | 2022-08-15 |
4 | 2015-07-20 | 2020-04-01 | 2022-08-15 |
4 | 2020-12-31 | 2022-10-01 | 2022-08-15 |
What I want to find out is, for each subject, are the 3 months prior to the inquiry date covered continuously by the begin/end ranges (with no more than a 14-day gap between the end date and the next start date)? For example subject 1: they have continuous observation between 1988-01-01 and 2022-10-02, so yes this subject's inquiry date has 3 months covered prior; whereas, subject 2 does not.
Working in SQL (snowflake) and started with something like this (obviously not complete, just not sure how to get to where I want):
with lookback as (
select *, INQUIRY_DATE - interval '3 months' as look_back_3m from tbl)
select *, case when diff >= 14 then 1 else 0 end as flag from (
select SUBJECT, BEGIN_DATE - lag(END_DATE) over(partition by subject order by BEGIN_DATE) as diff from tbl) z
CodePudding user response:
The first step is that you want to filter out any rows, that end prior to 4 months prior to your INQUIRY_DATE or start 1 month after your INQUIRY_DATE, as they are outside the 3 month 14 day rules you have defined.
So some enhanced data:
with your_table(subject, begin_date, end_date, inquiry_date) as (
select * from values
(1, '1988-01-01'::date, '2010-04-05'::date, '2022-05-06'::date),
(1, '2010-04-06'::date, '2022-10-02'::date, '2022-05-06'::date),
(2, '1996-09-24'::date, '2005-08-08'::date, '2022-10-01'::date),
(2, '2016-11-21'::date, '2022-04-04'::date, '2022-10-01'::date),
(3, '2005-01-01'::date, '2021-02-12'::date, '2022-03-21'::date),
(4, '1999-12-31'::date, '2015-07-16'::date, '2022-08-15'::date),
(4, '2015-07-20'::date, '2020-04-01'::date, '2022-08-15'::date),
(4, '2020-12-31'::date, '2022-10-01'::date, '2022-08-15'::date),
(5, '2022-05-01'::date, '2022-06-01'::date, '2022-08-15'::date),
(5, '2022-06-01'::date, '2022-07-01'::date, '2022-08-15'::date),
(5, '2022-07-01'::date, '2022-08-01'::date, '2022-08-15'::date),
(5, '2022-08-01'::date, '2022-10-01'::date, '2022-08-15'::date),
(6, '2022-05-01'::date, '2022-05-10'::date, '2022-08-15'::date),
(6, '2022-06-01'::date, '2022-06-10'::date, '2022-08-15'::date),
(6, '2022-07-01'::date, '2022-07-10'::date, '2022-08-15'::date),
(6, '2022-08-01'::date, '2022-10-01'::date, '2022-08-15'::date)
)
select *
from your_table
where begin_date <= inquiry_date
and end_date >= dateadd(month, -4, inquiry_date);
gives:
SUBJECT | BEGIN_DATE | END_DATE | INQUIRY_DATE |
---|---|---|---|
1 | 2010-04-06 | 2022-10-02 | 2022-05-06 |
4 | 2020-12-31 | 2022-10-01 | 2022-08-15 |
5 | 2022-05-01 | 2022-06-01 | 2022-08-15 |
5 | 2022-06-01 | 2022-07-01 | 2022-08-15 |
5 | 2022-07-01 | 2022-08-01 | 2022-08-15 |
5 | 2022-08-01 | 2022-10-01 | 2022-08-15 |
6 | 2022-05-01 | 2022-05-10 | 2022-08-15 |
6 | 2022-06-01 | 2022-06-10 | 2022-08-15 |
6 | 2022-07-01 | 2022-07-10 | 2022-08-15 |
6 | 2022-08-01 | 2022-10-01 | 2022-08-15 |
Then you want to find chains of gaps and make them contentious.
something like:
select *
,lag(end_date) over(partition by subject order by begin_date) as prior_end
,datediff(days, prior_end, begin_date) as days_of_gap
,(begin_date - 14) <= prior_end as prior_overlapped
,iff(prior_overlapped, null, begin_date) as group_start_date
from your_table
where begin_date <= inquiry_date
and end_date >= dateadd(month, -4, inquiry_date);
Then you want to ask, is any chain start prior/on INQUIRY_DATE - 3 months and ends on/after INQUIRY_DATE.
select
subject
,begin_date
,end_date
,group_date
,dateadd(month, -3, inquiry_date) as req_start_date
,datediff(days, group_date, end_date) as days_of_streak
,end_date >= inquiry_date as rule_1
,group_date <= dateadd(month, -3, inquiry_date) as rule_2
from (
select *
,lag(group_start_date) ignore nulls over (partition by subject order by begin_date) as l_g_d
,nvl(group_start_date, l_g_d) as group_date
from (
select *
,lag(end_date) over(partition by subject order by begin_date) as prior_end
,iff((begin_date - 14) <= prior_end, null, begin_date) as group_start_date
from your_table
where begin_date <= inquiry_date
and end_date >= dateadd(month, -4, inquiry_date)
)
)
order by subject, begin_date
;
gives:
SUBJECT | BEGIN_DATE | END_DATE | GROUP_DATE | REQ_START_DATE | DAYS_OF_STREAK | RULE_1 | RULE_2 |
---|---|---|---|---|---|---|---|
1 | 2010-04-06 | 2022-10-02 | 2010-04-06 | 2022-02-06 | 4,562 | TRUE | TRUE |
4 | 2020-12-31 | 2022-10-01 | 2020-12-31 | 2022-05-15 | 639 | TRUE | TRUE |
5 | 2022-05-01 | 2022-06-01 | 2022-05-01 | 2022-05-15 | 31 | FALSE | TRUE |
5 | 2022-06-01 | 2022-07-01 | 2022-05-01 | 2022-05-15 | 61 | FALSE | TRUE |
5 | 2022-07-01 | 2022-08-01 | 2022-05-01 | 2022-05-15 | 92 | FALSE | TRUE |
5 | 2022-08-01 | 2022-10-01 | 2022-05-01 | 2022-05-15 | 153 | TRUE | TRUE |
6 | 2022-05-01 | 2022-05-10 | 2022-05-01 | 2022-05-15 | 9 | FALSE | TRUE |
6 | 2022-06-01 | 2022-06-10 | 2022-06-01 | 2022-05-15 | 9 | FALSE | FALSE |
6 | 2022-07-01 | 2022-07-10 | 2022-07-01 | 2022-05-15 | 9 | FALSE | FALSE |
6 | 2022-08-01 | 2022-10-01 | 2022-08-01 | 2022-05-15 | 61 | TRUE | FALSE |
and we want rows that both rules are true..
thus:
select
subject
,group_date
,inquiry_date
from (
select *
,lag(group_start_date) ignore nulls over (partition by subject order by begin_date) as l_g_d
,nvl(group_start_date, l_g_d) as group_date
from (
select *
,lag(end_date) over(partition by subject order by begin_date) as prior_end
,iff((begin_date - 14) <= prior_end, null, begin_date) as group_start_date
from your_table
where begin_date <= inquiry_date
and end_date >= dateadd(month, -4, inquiry_date)
)
)
where end_date >= inquiry_date
and group_date <= dateadd(month, -3, inquiry_date)
order by subject, begin_date
;
gives:
SUBJECT | GROUP_DATE | INQUIRY_DATE |
---|---|---|
1 | 2010-04-06 | 2022-05-06 |
4 | 2020-12-31 | 2022-08-15 |
5 | 2022-05-01 | 2022-08-15 |