Home > other >  How to find missing hours and show data as 0
How to find missing hours and show data as 0

Time:05-18

This query pulls counts for every hour of every day but leaves off any days and hours that don't have counts. How can I fill in the missing dates and hours between the dates selected with 0 for the counts?

SELECT CAST(RecordTime AS date) AS Date, DATENAME(dw, RecordTime) AS [Day of the week], DATEPART(hour, RecordTime) AS [Hour of the day], COUNT(*) AS [Hourly Count]
    FROM Counts
    WHERE (RecordTime >= CONVERT(DATETIME, '2022-04-01 00:00:00', 102)) AND (RecordTime < CONVERT(DATETIME, '2022-05-01 00:00:00', 102)) AND (MachineNum = 11) AND (Cavity = 1)
    GROUP BY CAST(RecordTime AS date), DATEPART(hour, RecordTime), DATENAME(dw, RecordTime)
    ORDER BY Date, [Hour of the day]

CodePudding user response:

As Larnu suggested, you need to generate a full dataset with all day hour combinations in the range, in order to left join to. By my calculation you need 30 days * 24 hours = 720 rows. If you don't already have a numbers table, or a calendar table, or a sequence generating function, you can generate this using recursive CTEs as follows:

DECLARE @StartDate     datetime = '20220401',
        @AfterLastDate datetime = '20220501';
        
;WITH days(d) AS
(
  SELECT 0 UNION ALL SELECT d 1 FROM days
  WHERE d < DATEDIFF(DAY, @StartDate, @AfterLastDate) - 1
), hours(h) AS
(
  SELECT 0 UNION ALL SELECT h 1 FROM hours WHERE h<23
),
dates(DayHour, h) AS
(
  SELECT DATEADD(HOUR, hours.h, DATEADD(DAY, days.d, @StartDate)), hours.h
  FROM days CROSS JOIN hours
)
SELECT d.DayHour, DATENAME(WEEKDAY, DayHour), d.h
  FROM dates AS d
  ORDER BY d.DayHour;

Output:

DayHour Day of the week Hour of the day
2022-04-01 00:00:00.000 Friday 0
2022-04-01 01:00:00.000 Friday 1
2022-04-01 02:00:00.000 Friday 2
... 714 more rows ...
2022-04-30 21:00:00.000 Saturday 21
2022-04-30 22:00:00.000 Saturday 22
2022-04-30 23:00:00.000 Saturday 23

Now, we just need to left outer join that against your existing table:

DECLARE @StartDate     datetime = '20220401',
        @AfterLastDate datetime = '20220501';
        
;WITH days(d) AS
(
  SELECT 0 UNION ALL SELECT d 1 FROM days
  WHERE d < DATEDIFF(DAY, @StartDate, @AfterLastDate) - 1
), hours(h) AS
(
  SELECT 0 UNION ALL SELECT h 1 FROM hours WHERE h<23
),
dates(DayHour, h) AS
(
  SELECT DATEADD(HOUR, hours.h, DATEADD(DAY, days.d, @StartDate)), 
    hours.h FROM days CROSS JOIN hours
)
SELECT [Date] = CONVERT(date, d.DayHour), 
  [Day of the week] = DATENAME(WEEKDAY, d.DayHour), 
  [Hour of the day] = d.h,
  [Hourly Count] = COUNT(c.RecordTime)
FROM dates AS d
LEFT OUTER JOIN dbo.Counts AS c
ON c.RecordTime >= d.DayHour
  AND c.RecordTime < DATEADD(HOUR, 1, d.DayHour)
  AND c.MachineNum = 11
  AND c.Cavity = 1
GROUP BY CONVERT(date, d.DayHour), DATENAME(WEEKDAY, DayHour), d.h
ORDER BY [Date], [Hour of the day];

If you have a numbers table, the generation of dates is a little easier. This is a simple example and only contains 1,000 rows as the largest range of dates you expect, and uses a recursive CTE - there are multiple ways to populate a numbers table initially and the performance there is not important.

CREATE TABLE dbo.Numbers(n int PRIMARY KEY);

;WITH x(x) AS 
(
  SELECT 0 UNION ALL SELECT x 1 FROM x 
  WHERE x < 1000
)
INSERT dbo.Numbers(n) 
  SELECT x FROM x OPTION (MAXRECURSION 0);

Now the query to get all the dates in the range:

DECLARE @StartDate datetime = '20220401',
        @AfterLastDate datetime = '20220501';
       
;WITH dates(d) AS 
(
  SELECT TOP (DATEDIFF(DAY, @StartDate, @AfterLastDate)) n 
  FROM dbo.Numbers ORDER BY n
),
hours(h) AS 
(
  SELECT TOP (24) n FROM dbo.Numbers ORDER BY n
)
SELECT DayHour = DATEADD(HOUR, hours.h, 
    DATEADD(DAY, dates.d, @StartDate))
  FROM dates CROSS JOIN hours
  ORDER BY DayHour;

You can then use that as the core dataset to left join to just like in the above examples.

CodePudding user response:

You need to have a reference table which stores all the dates and respective hour. It will be kind of a Calendar table with hour added to it.

Step 1:

Create table hours.

CREATE TABLE hours
  (
     hour_key INTEGER
  );

INSERT INTO hours
VALUES      (0),
            (1),
            (2),
            (3),
            (4),
            (5),
            (6),
            (7),
            (8),
            (9),
            (10),
            (11),
            (12),
            (13),
            (14),
            (15),
            (16),
            (17),
            (18),
            (19),
            (20),
            (21),
            (22),
            (23); 

Step 2:

Create a table which will have the required date and hour field.

SELECT Cast(recordtime AS DATE) AS Date_key,
               Datename(dw, recordtime) week_key,
               hour_key
        FROM   hours
               CROSS JOIN counts
        GROUP  BY Cast(recordtime AS DATE),
                  Datename(dw, recordtime),
                  hour_key

Note: Using Counts table to derive the date and hour field is a poor solution, As it might be a transactional table and have huge number of records. Instead use a Calendar table.

Final Step:

Use the table created in Step 2 as the main table(or use as subquery) and left join Counts with it based on Date and hour.

Below query should give you the desired output.

SELECT date_key            AS Date,
       week_key            AS [Day of the week],
       hour_key            AS [Hour of the day],
       Count(c.recordtime) AS [Hourly Count]
FROM   (SELECT Cast(recordtime AS DATE) AS Date_key,
               Datename(dw, recordtime) week_key,
               hour_key
        FROM   hours
               CROSS JOIN counts
        GROUP  BY Cast(recordtime AS DATE),
                  Datename(dw, recordtime),
                  hour_key)cal
       LEFT JOIN counts c
              ON cal.date_key = Cast(c.recordtime AS DATE)
                 AND cal.hour_key = Datepart(hour, c.recordtime)
GROUP  BY date_key,
          week_key,
          hour_key
ORDER  BY date,
          [hour of the day] 

SQL Fiddle: Try it here

  • Related