Home > database >  SQL Calendar and Count of Values
SQL Calendar and Count of Values

Time:02-02

I need to do a count of values by date and fill in missing calendar dates, as well as, format the date I know how to just count it but creating a whole calendar to plot this on and creating a net total is where I am running into issues.

Example Data

Value Date Open Date Closed
123456 2023-01-26 00:00:00.0000 2023-01-31 00:00:00.0000
653421 2023-02-01 00:00:00.0000 2023-02-02 00:00:00.0000
987654 2023-02-01 00:00:00.0000 NULL

Example DDL/DML:

INSERT INTO @Example (Value, DateOpen, DateClosed) VALUES
(123456, '2023-01-26 00:00:00.0000', '2023-01-31 00:00:00.0000'),
(653421, '2023-02-01 00:00:00.0000', '2023-02-02 00:00:00.0000'),
(987654, '2023-02-01 00:00:00.0000', NULL);

Desired Outcome

Calendar Date Count Open Count Closed Net
2023-01-26 1 0 1
2023-01-27 0 0 1
2023-01-28 0 0 1
2023-01-29 0 0 1
2023-01-30 0 0 1
2023-01-31 0 1 0
2023-02-01 2 0 2
2023-02-02 0 1 1
DECLARE @MinDate DATE = CAST(DATEADD(YY, -5, getdate()) as DATE),  
        @MaxDate DATE =  CAST(GETDATE() as DATE);

But I would rather have it take the earliest date available in the data

CodePudding user response:

This is a pretty common problem. The perfect solution would see you adding a calendar table to your database/server, but that's not always practical. You can simulate a basic one using a little CTE trick which makes use of the range actually present in your data:

;WITH BaseDates AS (
SELECT MIN(DateOpen) AS Date, CASE WHEN MAX(DateClosed)>MAX(DateOpen) THEN MAX(DateClosed) ELSE MAX(DateOpen) END AS mDate
  FROM @Example
UNION ALL
SELECT DATEADD(DAY,1,Date), mDate
  FROM BaseDates
 WHERE date < mDate
)
Date mDate
2023-01-26 2023-02-02
2023-01-27 2023-02-02
2023-01-28 2023-02-02
2023-01-29 2023-02-02
2023-01-30 2023-02-02
2023-01-31 2023-02-02
2023-02-01 2023-02-02
2023-02-02 2023-02-02

This gives us a complete list of all the dates in the range of your data. We can then use it as the initial table to query to get your required results:

;WITH BaseDates AS (
SELECT MIN(DateOpen) AS Date, CASE WHEN MAX(DateClosed)>MAX(DateOpen) THEN MAX(DateClosed) ELSE MAX(DateOpen) END AS mDate
  FROM @Example
UNION ALL
SELECT DATEADD(DAY,1,Date), mDate
  FROM BaseDates
 WHERE date < mDate
)

SELECT Date, OpenCnt, ClosedCnt, SUM(OpenCnt-ClosedCnt) OVER (ORDER BY Date) AS NetCnt
  FROM (
SELECT Date, COUNT(DISTINCT o.Value) AS OpenCnt, COUNT(DISTINCT c.Value) AS ClosedCnt
  FROM BaseDates b
    LEFT OUTER JOIN @Example o
      ON b.Date = o.DateOpen
      AND (
               b.date < o.DateClosed 
            OR o.DateClosed IS NULL
          )
    LEFT OUTER JOIN @Example c
      ON b.Date = c.DateClosed
 GROUP BY b.Date
       ) a

Here we use the baseDates, and left outer join to the data on opens, and closeds for that day, counting the distinct values column. We use that as a subquery and apply a windowed sum function over the top to give us a running total, ordered by date.

Date OpenCnt ClosedCnt NetCnt
2023-01-26 1 0 1
2023-01-27 0 0 1
2023-01-28 0 0 1
2023-01-29 0 0 1
2023-01-30 0 0 1
2023-01-31 0 1 0
2023-02-01 2 0 2
2023-02-02 0 1 1

CodePudding user response:

Just another option using an ad-hoc tally/numbers table and a conditional aggregation. Then finally, the window function sum() over() to calculate the running NET.

You may notice I used TOP 1000 for the numbers table. Feel free to adjust to a more reasonable number (but be generous ... leave some wiggle room)

Example

;with cte as (
Select CalendarDate  
      ,CountOpen    = sum( case when [Date Open]  =CalendarDate then 1 else 0 end)
      ,CountClosed  = sum( case when [Date Closed]=CalendarDate then 1 else 0 end)
 From YourTable
 Join ( Select Top 1000 N=-1 Row_Number() Over (Order By (Select NULL)) From master..spt_values n1, master..spt_values n2 ) B
   on N<=datediff(day,[Date Open],coalesce([Date Closed],[Date Open]))
 Cross Apply ( values (dateadd(day,N,[Date Open])) ) C(CalendarDate)
 Group By CalendarDate
)
Select *
      ,Net = sum(CountOpen-CountClosed) over (order by CalendarDate)
 From  cte
 Order by CalendarDate

Results

enter image description here

  • Related