Home > database >  Partition BY start time of day in SQL Server
Partition BY start time of day in SQL Server

Time:02-15

Looking for ways to specify the start time of a PARTITION BY statement in SQL Server.

Partitioning a years worth of data into 60 minute segments. The data is 10 minute collections from an IOT device. Would like the partitions to start at 6AM each day.

How do I accomplish that fixed start time every day?

Here's some sample data. Want the windowing (partition) to start on the hour:

Sample data, 10 minute data sampling:

code datetimePDT data
AA01 12/15/2021 05:44 AM 0100
AA02 12/15/2021 05:54 AM 0105
AA03 12/15/2021 06:04 AM 0103
AA04 12/15/2021 06:14 AM 0109
AA05 12/15/2021 06:24 AM 0112
AA06 12/15/2021 06:34 AM 0115
AA07 12/15/2021 06:44 AM 0119
AA08 12/15/2021 06:54 AM 0125
AA09 12/15/2021 07:04 AM 0135
AA10 12/15/2021 07:14 AM 0155
AA11 12/15/2021 07:24 AM 0195

In a stored procedure - Ranking by minute: dense_rank() over (order by datepart(day,datetimePDT), datepart(hour,datetimePDT), datepart(minute,datetimePDT)) minuteRank

Grouping minutes into hours: CEILING((minuteRank-1)/10) hourGroup

Then doing things like pulling out the average: avg(data) over (partition by hourGroup) as GroupAVG

Prefer the hourGroup to start at 6 AM, so my GroupAVG is over the rows from 6:04 to 6:54, and the next partition is from 7-8 AM.

To make this more complicated, there may be missing data, so I can't rely on the data collection period being 10 minutes.

Want to get here:

code datetimePDT data minuteRank hourGroup
AA01 12/15/2021 05:44 AM 0100 01 NULL
AA02 12/15/2021 05:54 AM 0105 02 NULL
AA03 12/15/2021 06:04 AM 0103 03 0001
AA04 12/15/2021 06:14 AM 0109 04 0001
AA05 12/15/2021 06:24 AM 0112 05 0001
AA06 12/15/2021 06:34 AM 0115 06 0001
AA07 12/15/2021 06:44 AM 0119 07 0001
AA08 12/15/2021 06:54 AM 0125 08 0001
AA09 12/15/2021 07:04 AM 0135 09 0002
AA10 12/15/2021 07:14 AM 0155 10 0002
AA11 12/15/2021 07:24 AM 0195 11 0002

CodePudding user response:

Maybe something like this?

Declare @testTable table (MyTestDate datetime);
 Insert Into @testTable (MyTestDate)
 Values ('2022-02-12 04:06:57.683')
      , ('2022-02-12 05:06:57.683')
      , ('2022-02-12 06:06:57.683')
      , ('2022-02-12 07:06:57.683')
      , ('2022-02-12 08:06:57.683')
      , ('2022-02-12 09:06:57.683')
      , ('2022-02-12 10:06:57.683')
      , ('2022-02-12 11:06:57.683')
      , ('2022-02-12 12:06:57.683')
      , ('2022-02-12 13:06:57.683')
      , ('2022-02-12 14:06:57.683')
      , ('2022-02-12 15:06:57.683')
      , ('2022-02-12 16:06:57.683')
      , ('2022-02-12 17:06:57.683')
      , ('2022-02-12 18:06:57.683')
      , ('2022-02-12 19:06:57.683')
      , ('2022-02-12 20:06:57.683')
      , ('2022-02-12 12:06:57.683')
      , ('2022-02-13 04:06:57.683')
      , ('2022-02-13 05:06:57.683')
      , ('2022-02-13 06:06:57.683')
      , ('2022-02-13 07:06:57.683')
      , ('2022-02-13 08:06:57.683')
      , ('2022-02-13 09:06:57.683')
      , ('2022-02-13 10:06:57.683')
      , ('2022-02-13 11:06:57.683')
      , ('2022-02-13 12:06:57.683')
      , ('2022-02-13 13:06:57.683')
      , ('2022-02-13 14:06:57.683')
      , ('2022-02-13 15:06:57.683')
      , ('2022-02-13 16:06:57.683')
      , ('2022-02-13 17:06:57.683')
      , ('2022-02-13 18:06:57.683')
      , ('2022-02-13 19:06:57.683')
      , ('2022-02-13 20:06:57.683')
      , ('2022-02-13 12:06:57.683');

Select *
     , row_number() Over(Partition By t.start_date Order By tt.MyTestDate)
  From @testTable tt
 Cross Apply (Values(dateadd(day, datediff(day, '09:00', tt.MyTestDate) - iif(datepart(hour, tt.MyTestDate) < 9, 1, 0), '09:00'))) As t(start_date);

CodePudding user response:

What is the reasoning for PARTITION BY instead of just GROUP BY on the HOUR? You could play around with it obviously and put the aggregation into a CTE if you still want to see the individual values.

DECLARE @Table TABLE (code VARCHAR(10), datetimePDT DATETIME, [data] INT)
INSERT INTO @Table VALUES
('AA01','12/15/2021 05:44 AM', 0100),
('AA02','12/15/2021 05:54 AM', 0105),
('AA03','12/15/2021 06:04 AM', 0103),
('AA04','12/15/2021 06:14 AM', 0109),
('AA05','12/15/2021 06:24 AM', 0112),
('AA06','12/15/2021 06:34 AM', 0115),
('AA07','12/15/2021 06:44 AM', 0119),
('AA08','12/15/2021 06:54 AM', 0125),
('AA09','12/15/2021 07:04 AM', 0135),
('AA10','12/15/2021 07:14 AM', 0155),
('AA11','12/15/2021 07:24 AM', 0195);


SELECT MIN(tt.code) AS FirstOfGroup,
    MAX(tt.code) AS LastOfGroup,
    COUNT(tt.code) AS NumberInGroup,
    DATEPART(YEAR,tt.datetimePDT) AS [DataYear],
    DATEPART(MONTH,tt.datetimePDT) AS [DataMonth],
    DATEPART(DAY,tt.datetimePDT) AS [DataDay],
    DATEPART(HOUR,tt.datetimePDT) AS [DataHour],
    AVG(tt.[data]) AS AvgData
FROM @Table tt
GROUP BY DATEPART(YEAR,tt.datetimePDT),DATEPART(MONTH,tt.datetimePDT),DATEPART(DAY,tt.datetimePDT), DATEPART(HOUR,tt.datetimePDT)
FirstOfGroup LastOfGroup NumberInGroup DataYear DataMonth DataDay DataHour AvgData
AA01 AA02 2 2021 12 15 5 102
AA03 AA08 6 2021 12 15 6 113
AA09 AA11 3 2021 12 15 7 161
  • Related