I need to do a count of values by date and fill in missing calendar dates, as well as, format the date I know how to just count it but creating a whole calendar to plot this on and creating a net total is where I am running into issues.
Example Data
Value | Date Open | Date Closed |
---|---|---|
123456 | 2023-01-26 00:00:00.0000 | 2023-01-31 00:00:00.0000 |
653421 | 2023-02-01 00:00:00.0000 | 2023-02-02 00:00:00.0000 |
987654 | 2023-02-01 00:00:00.0000 | NULL |
Example DDL/DML:
INSERT INTO @Example (Value, DateOpen, DateClosed) VALUES
(123456, '2023-01-26 00:00:00.0000', '2023-01-31 00:00:00.0000'),
(653421, '2023-02-01 00:00:00.0000', '2023-02-02 00:00:00.0000'),
(987654, '2023-02-01 00:00:00.0000', NULL);
Desired Outcome
Calendar Date | Count Open | Count Closed | Net |
---|---|---|---|
2023-01-26 | 1 | 0 | 1 |
2023-01-27 | 0 | 0 | 1 |
2023-01-28 | 0 | 0 | 1 |
2023-01-29 | 0 | 0 | 1 |
2023-01-30 | 0 | 0 | 1 |
2023-01-31 | 0 | 1 | 0 |
2023-02-01 | 2 | 0 | 2 |
2023-02-02 | 0 | 1 | 1 |
DECLARE @MinDate DATE = CAST(DATEADD(YY, -5, getdate()) as DATE),
@MaxDate DATE = CAST(GETDATE() as DATE);
But I would rather have it take the earliest date available in the data
CodePudding user response:
This is a pretty common problem. The perfect solution would see you adding a calendar table to your database/server, but that's not always practical. You can simulate a basic one using a little CTE trick which makes use of the range actually present in your data:
;WITH BaseDates AS (
SELECT MIN(DateOpen) AS Date, CASE WHEN MAX(DateClosed)>MAX(DateOpen) THEN MAX(DateClosed) ELSE MAX(DateOpen) END AS mDate
FROM @Example
UNION ALL
SELECT DATEADD(DAY,1,Date), mDate
FROM BaseDates
WHERE date < mDate
)
Date | mDate |
---|---|
2023-01-26 | 2023-02-02 |
2023-01-27 | 2023-02-02 |
2023-01-28 | 2023-02-02 |
2023-01-29 | 2023-02-02 |
2023-01-30 | 2023-02-02 |
2023-01-31 | 2023-02-02 |
2023-02-01 | 2023-02-02 |
2023-02-02 | 2023-02-02 |
This gives us a complete list of all the dates in the range of your data. We can then use it as the initial table to query to get your required results:
;WITH BaseDates AS (
SELECT MIN(DateOpen) AS Date, CASE WHEN MAX(DateClosed)>MAX(DateOpen) THEN MAX(DateClosed) ELSE MAX(DateOpen) END AS mDate
FROM @Example
UNION ALL
SELECT DATEADD(DAY,1,Date), mDate
FROM BaseDates
WHERE date < mDate
)
SELECT Date, OpenCnt, ClosedCnt, SUM(OpenCnt-ClosedCnt) OVER (ORDER BY Date) AS NetCnt
FROM (
SELECT Date, COUNT(DISTINCT o.Value) AS OpenCnt, COUNT(DISTINCT c.Value) AS ClosedCnt
FROM BaseDates b
LEFT OUTER JOIN @Example o
ON b.Date = o.DateOpen
AND (
b.date < o.DateClosed
OR o.DateClosed IS NULL
)
LEFT OUTER JOIN @Example c
ON b.Date = c.DateClosed
GROUP BY b.Date
) a
Here we use the baseDates, and left outer join to the data on opens, and closeds for that day, counting the distinct values column. We use that as a subquery and apply a windowed sum function over the top to give us a running total, ordered by date.
Date | OpenCnt | ClosedCnt | NetCnt |
---|---|---|---|
2023-01-26 | 1 | 0 | 1 |
2023-01-27 | 0 | 0 | 1 |
2023-01-28 | 0 | 0 | 1 |
2023-01-29 | 0 | 0 | 1 |
2023-01-30 | 0 | 0 | 1 |
2023-01-31 | 0 | 1 | 0 |
2023-02-01 | 2 | 0 | 2 |
2023-02-02 | 0 | 1 | 1 |
CodePudding user response:
Just another option using an ad-hoc tally/numbers table and a conditional aggregation. Then finally, the window function sum() over()
to calculate the running NET.
You may notice I used TOP 1000
for the numbers table. Feel free to adjust to a more reasonable number (but be generous ... leave some wiggle room)
Example
;with cte as (
Select CalendarDate
,CountOpen = sum( case when [Date Open] =CalendarDate then 1 else 0 end)
,CountClosed = sum( case when [Date Closed]=CalendarDate then 1 else 0 end)
From YourTable
Join ( Select Top 1000 N=-1 Row_Number() Over (Order By (Select NULL)) From master..spt_values n1, master..spt_values n2 ) B
on N<=datediff(day,[Date Open],coalesce([Date Closed],[Date Open]))
Cross Apply ( values (dateadd(day,N,[Date Open])) ) C(CalendarDate)
Group By CalendarDate
)
Select *
,Net = sum(CountOpen-CountClosed) over (order by CalendarDate)
From cte
Order by CalendarDate
Results