I'm looking for a way to find the maximum concurrent capacity of an institution (hospital) in terms of the number of studies it can run parallelly.
Even if there is 1 day overlap, the studies are considered to be overlapping. In the below data, there are 2 batches of overlaps at hospital "I1"- in the first batch there are 4 studies overlapping and in the second there are 2 studies overlapping. In summary, the maximum concurrent capacity of I1 is 4 (meaning it can handle 4 studies parallelly).
Can you help/guide with a efficient SQL for this?
Script to create test data is available below. Note: INST_ID is the hospital id.
CREATE TABLE TEST_INST_DT(INST_ID VARCHAR2(10), STUDY_ID VARCHAR2(10), STUDY_START_DATE DATE, STUDY_END_DATE DATE);
-- Overlap (4 studies)
INSERT INTO TEST_INST_DT VALUES('I1', 'S1', TO_DATE('31-DEC-2021', 'DD-MON-YYYY'), TO_DATE('02-JAN-2022', 'DD-MON-YYYY'));
INSERT INTO TEST_INST_DT VALUES('I1', 'S1', TO_DATE('01-JAN-2022', 'DD-MON-YYYY'), TO_DATE('05-JAN-2022', 'DD-MON-YYYY'));
INSERT INTO TEST_INST_DT VALUES('I1', 'S2', TO_DATE('02-JAN-2022', 'DD-MON-YYYY'), TO_DATE('03-JAN-2022', 'DD-MON-YYYY'));
INSERT INTO TEST_INST_DT VALUES('I1', 'S3', TO_DATE('04-JAN-2022', 'DD-MON-YYYY'), TO_DATE('10-JAN-2022', 'DD-MON-YYYY'));
-- Overlap (2 studies)
INSERT INTO TEST_INST_DT VALUES('I1', 'S4', TO_DATE('01-FEB-2022', 'DD-MON-YYYY'), TO_DATE('05-FEB-2022', 'DD-MON-YYYY'));
INSERT INTO TEST_INST_DT VALUES('I1', 'S5', TO_DATE('02-FEB-2022', 'DD-MON-YYYY'), TO_DATE('03-FEB-2022', 'DD-MON-YYYY'));
CodePudding user response:
With MATCH_RECOGNIZE
select * from test_inst_dt
match_recognize (
partition by inst_id
order by study_start_date, study_end_date
measures first(study_start_date) as study_start_date, max(study_end_date) as study_end_date, count(*) as nstudies
pattern( merged* strt )
define
merged as max(study_end_date) > next(study_start_date)
);
CodePudding user response:
Sorry, didn't pay attention to 11g criteria, then try
with rtest_inst_dt(rn, inst_id,study_id,study_start_date,study_end_date) as (
select row_number() over(partition by inst_id order by study_start_date,study_end_date),
inst_id,study_id, study_start_date,study_end_date
from test_inst_dt
)
,cte(rn, root, inst_id,study_id,study_start_date,study_end_date) as (
select rn, rn, inst_id,study_id,study_start_date,study_end_date
from rtest_inst_dt d1
where not exists(
select 1 from rtest_inst_dt d2
where d2.rn < d1.rn and
(
d2.study_end_date between d1.study_start_date and d1.study_end_date
or d2.study_start_date between d1.study_start_date and d1.study_end_date
or (d2.study_start_date < d1.study_start_date and d2.study_end_date > d1.study_end_date)
)
)
union all
select d1.rn, c.root, d1.inst_id, d1.study_id, least(c.study_start_date, d1.study_start_date),
greatest(c.study_end_date, d1.study_end_date)
from cte c
join rtest_inst_dt d1 on d1.rn > c.rn
and d1.study_start_date between c.study_start_date and c.study_end_date
)
select inst_id, min(study_start_date) as study_start_date, max(study_end_date) as study_end_date, count(distinct rn) as n
from cte
group by inst_id, root
;
I1 31/12/21 10/01/22 4
I1 01/02/22 05/02/22 2