I have a requirement to calculate active users in a given a month and year by the user type. I have date range for each user with start date and end date. A user with an open ended date is an active user and has to be counted from the beginning of time till current date (year month).
USER_ID | USER_TYPE | START_DATE | END_DATE |
---|---|---|---|
3673246 | INTERNAL | 1/1/2022 | 12/31/2999 |
1813148 | INTERNAL | 1/1/2022 | 12/31/2999 |
407508 | INTERNAL | 1/1/2022 | 3/1/2022 |
836682 | EXTERNAL | 2/1/2022 | 12/31/2999 |
1107520 | EXTERNAL | 2/1/2022 | 6/1/2022 |
2289472 | INTERNAL | 2/1/2022 | 12/31/2999 |
1833644 | EXTERNAL | 3/1/2022 | 12/31/2999 |
EXPECTED RESULT
YEAR_MONTH | USER_TYPE | ACTIVE_USER_COUNT |
---|---|---|
202201 | INTERNAL | 2 |
202201 | EXTERNAL | 0 |
202202 | INTERNAL | 2 |
202202 | EXTERNAL | 2 |
202203 | INTERNAL | 1 |
202203 | EXTERNAL | 2 |
202204 | INTERNAL | 3 |
202204 | EXTERNAL | 2 |
CodePudding user response:
I believe this example query is what you are trying to do. However, I don't agree with your expected results. For example, in January 2022, you have an active count of 2 for "Internal", but, even adding up the users by hand, I see 3 that were active in January. Test it out yourself:
--Create table variable to hold test data.
DECLARE @testData TABLE (
[user_id] int
, user_type nvarchar(50)
, [start_date] datetime
, end_date datetime
);
--Load up some test data.
INSERT INTO @testData ([user_id], user_type, [start_date], end_date)
VALUES
(3673246,'INTERNAL','1/1/2022','12/31/2999')
,(1813148,'INTERNAL','1/1/2022','12/31/2999')
,(407508,'INTERNAL','1/1/2022','3/1/2022')
,(836682,'EXTERNAL','2/1/2022','12/31/2999')
,(1107520,'EXTERNAL','2/1/2022','6/1/2022')
,(2289472,'INTERNAL','2/1/2022','12/31/2999')
,(1833644,'EXTERNAL','3/1/2022','12/31/2999')
;
--First CTE "x" is used to create a sequence of 10 numbers.
WITH x as (
SELECT * FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) as x(a)
)
--Second CTE "y" creates a sequence of 1000 numbers.
, y as (
SELECT ROW_NUMBER() OVER(ORDER BY hundreds.a, tens.a, ones.a) as row_num
FROM x as ones, x as tens, x as hundreds
)
--Third CTE "months" creates a sequence of months going back in time from today.
--To go farther back than 1000 months, modify the "y" CTE to have a "thousands" (or more) table(s).
, months as (
SELECT
DATEADD(month, -1 * y.row_num, GETDATE()) as [date]
, CAST(YEAR(DATEADD(month, -1 * y.row_num, GETDATE())) as nvarchar(6))
RIGHT('00' CAST(MONTH(DATEADD(month, -1 * y.row_num, GETDATE())) as nvarchar(6)),2) as YEAR_MONTH
FROM y
)
SELECT
m.YEAR_MONTH
, utype.user_type
, COUNT(td.[user_id]) as active_user_count
FROM months as m
--This join makes sure there is a 0 qty record for every month even
--if there are no active users in the month for this user_type.
JOIN (SELECT DISTINCT user_type FROM @testData) as utype
ON 1=1
--This joins our test data to the 2 tables above.
--We make sure the month in m is after the users' start dates
--and before the users' end dates.
LEFT OUTER JOIN @testData as td
ON td.user_type = utype.user_type
AND td.[start_date] <= CAST(m.[date] as date)
AND td.end_date >= CAST(m.[date] as date)
WHERE m.[date] >= '1/1/2022 00:00:00' --Limiting results to just 2022
GROUP BY m.YEAR_MONTH
, utype.user_type
ORDER BY m.YEAR_MONTH, utype.user_type