I have a table with the following structure:
ID | StartDate (YY-MM-DD) | EndDate (YY-MM-DD) |
---|---|---|
1 | 20-07-13 | 21-05-12 |
2 | 15-04-12 | 27-01-01 |
3 | 14-01-30 | 18-12-30 |
4 | 18-02-21 | |
5 | 20-12-12 | 20-12-15 |
6 | 20-11-11 | |
7 | 19-10-22 | 20-08-10 |
8 | 17-09-01 | 19-04-15 |
9 | 18-06-27 | 21-08-19 |
9 | 19-08-17 |
I would like to write a query that creates an output with the following structure: Count the IDs which are active during each period. For example ID: 1 was active between 13. July 2020 and 12. May 2021, so it should be added to the count during 202007 and 202105. As long as there is now EndDate, the ID should be added to the count till the present period.
COUNT_ID | YYYYMM |
---|---|
150 | 201601 |
200 | 201602 |
180 | 201603 |
... | ... |
... | ... |
... | ... |
I struggle to find a way to calculate the count of active IDs for each period, I believe there is an simple way to do that but unfortunately I don´t know.
Any help/tip/hint is much appreciated!
Best regards, Frederik
CodePudding user response:
If I'm understanding correctly, you will want to:
Generate a table of month-start-and-end ranges, between a start date (June 2016) and present
Then, for each month, examine how many of your records' start and end dates contain any days that fall within that month
CREATE TABLE #temp_ids (id INT,
id_start DATE,
id_end DATE);
INSERT INTO #temp_ids (id, id_start, id_end) Values (1,'20200713','20210512');
INSERT INTO #temp_ids (id, id_start, id_end) Values (2,'20150412','20270101');
INSERT INTO #temp_ids (id, id_start, id_end) Values (3,'20140130','20181230');
INSERT INTO #temp_ids (id, id_start) Values (4,'20180221');
INSERT INTO #temp_ids (id, id_start, id_end) Values (5,'20201212','20201215');
INSERT INTO #temp_ids (id, id_start) Values (6,'20201111');
INSERT INTO #temp_ids (id, id_start, id_end) Values (7,'20191022','20200810');
INSERT INTO #temp_ids (id, id_start, id_end) Values (8,'20170901','20190415');
INSERT INTO #temp_ids (id, id_start, id_end) Values (9,'20180627','20210809');
INSERT INTO #temp_ids (id, id_start) Values (10,'20190817');
DECLARE @windowStart DATE = '20160101',
@windowEnd DATE = GETDATE();
;WITH report_dates AS (--This is a convenient piece of code to generate a table of dates between two dates
SELECT
TOP (DATEDIFF(DAY, @windowStart, @windowEnd) 1)
report_date = DATEADD(DAY,
ROW_NUMBER() OVER(ORDER BY a.object_id) - 1,
@windowStart)
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b
),
month_windows AS (--Filter the dates we just generated to only month-start, and add month-end
SELECT report_date AS month_start,
EOMONTH(report_date) AS month_end
FROM report_dates
WHERE DATEPART(DAY, report_date) = 1
)
SELECT --Count the number of records which overlap with the month range
DISTINCT month_start,
month_end,
COUNT(id) AS count_id
FROM #temp_ids AS ti
JOIN month_windows mw ON ((ti.id_start <= mw.month_start AND ti.id_end >= mw.month_start)
OR (ti.id_start >= mw.month_start AND ti.id_start <= mw.month_end)
)
WHERE id_end IS NOT NULL
GROUP BY month_start,
month_end