Looking for ways to specify the start time of a PARTITION BY statement in SQL Server.
Partitioning a years worth of data into 60 minute segments. The data is 10 minute collections from an IOT device. Would like the partitions to start at 6AM each day.
How do I accomplish that fixed start time every day?
Here's some sample data. Want the windowing (partition) to start on the hour:
Sample data, 10 minute data sampling:
code | datetimePDT | data |
---|---|---|
AA01 | 12/15/2021 05:44 AM | 0100 |
AA02 | 12/15/2021 05:54 AM | 0105 |
AA03 | 12/15/2021 06:04 AM | 0103 |
AA04 | 12/15/2021 06:14 AM | 0109 |
AA05 | 12/15/2021 06:24 AM | 0112 |
AA06 | 12/15/2021 06:34 AM | 0115 |
AA07 | 12/15/2021 06:44 AM | 0119 |
AA08 | 12/15/2021 06:54 AM | 0125 |
AA09 | 12/15/2021 07:04 AM | 0135 |
AA10 | 12/15/2021 07:14 AM | 0155 |
AA11 | 12/15/2021 07:24 AM | 0195 |
In a stored procedure - Ranking by minute: dense_rank() over (order by datepart(day,datetimePDT), datepart(hour,datetimePDT), datepart(minute,datetimePDT)) minuteRank
Grouping minutes into hours: CEILING((minuteRank-1)/10) hourGroup
Then doing things like pulling out the average: avg(data) over (partition by hourGroup) as GroupAVG
Prefer the hourGroup to start at 6 AM, so my GroupAVG is over the rows from 6:04 to 6:54, and the next partition is from 7-8 AM.
To make this more complicated, there may be missing data, so I can't rely on the data collection period being 10 minutes.
Want to get here:
code | datetimePDT | data | minuteRank | hourGroup |
---|---|---|---|---|
AA01 | 12/15/2021 05:44 AM | 0100 | 01 | NULL |
AA02 | 12/15/2021 05:54 AM | 0105 | 02 | NULL |
AA03 | 12/15/2021 06:04 AM | 0103 | 03 | 0001 |
AA04 | 12/15/2021 06:14 AM | 0109 | 04 | 0001 |
AA05 | 12/15/2021 06:24 AM | 0112 | 05 | 0001 |
AA06 | 12/15/2021 06:34 AM | 0115 | 06 | 0001 |
AA07 | 12/15/2021 06:44 AM | 0119 | 07 | 0001 |
AA08 | 12/15/2021 06:54 AM | 0125 | 08 | 0001 |
AA09 | 12/15/2021 07:04 AM | 0135 | 09 | 0002 |
AA10 | 12/15/2021 07:14 AM | 0155 | 10 | 0002 |
AA11 | 12/15/2021 07:24 AM | 0195 | 11 | 0002 |
CodePudding user response:
Maybe something like this?
Declare @testTable table (MyTestDate datetime);
Insert Into @testTable (MyTestDate)
Values ('2022-02-12 04:06:57.683')
, ('2022-02-12 05:06:57.683')
, ('2022-02-12 06:06:57.683')
, ('2022-02-12 07:06:57.683')
, ('2022-02-12 08:06:57.683')
, ('2022-02-12 09:06:57.683')
, ('2022-02-12 10:06:57.683')
, ('2022-02-12 11:06:57.683')
, ('2022-02-12 12:06:57.683')
, ('2022-02-12 13:06:57.683')
, ('2022-02-12 14:06:57.683')
, ('2022-02-12 15:06:57.683')
, ('2022-02-12 16:06:57.683')
, ('2022-02-12 17:06:57.683')
, ('2022-02-12 18:06:57.683')
, ('2022-02-12 19:06:57.683')
, ('2022-02-12 20:06:57.683')
, ('2022-02-12 12:06:57.683')
, ('2022-02-13 04:06:57.683')
, ('2022-02-13 05:06:57.683')
, ('2022-02-13 06:06:57.683')
, ('2022-02-13 07:06:57.683')
, ('2022-02-13 08:06:57.683')
, ('2022-02-13 09:06:57.683')
, ('2022-02-13 10:06:57.683')
, ('2022-02-13 11:06:57.683')
, ('2022-02-13 12:06:57.683')
, ('2022-02-13 13:06:57.683')
, ('2022-02-13 14:06:57.683')
, ('2022-02-13 15:06:57.683')
, ('2022-02-13 16:06:57.683')
, ('2022-02-13 17:06:57.683')
, ('2022-02-13 18:06:57.683')
, ('2022-02-13 19:06:57.683')
, ('2022-02-13 20:06:57.683')
, ('2022-02-13 12:06:57.683');
Select *
, row_number() Over(Partition By t.start_date Order By tt.MyTestDate)
From @testTable tt
Cross Apply (Values(dateadd(day, datediff(day, '09:00', tt.MyTestDate) - iif(datepart(hour, tt.MyTestDate) < 9, 1, 0), '09:00'))) As t(start_date);
CodePudding user response:
What is the reasoning for PARTITION BY
instead of just GROUP BY
on the HOUR
? You could play around with it obviously and put the aggregation into a CTE if you still want to see the individual values.
DECLARE @Table TABLE (code VARCHAR(10), datetimePDT DATETIME, [data] INT)
INSERT INTO @Table VALUES
('AA01','12/15/2021 05:44 AM', 0100),
('AA02','12/15/2021 05:54 AM', 0105),
('AA03','12/15/2021 06:04 AM', 0103),
('AA04','12/15/2021 06:14 AM', 0109),
('AA05','12/15/2021 06:24 AM', 0112),
('AA06','12/15/2021 06:34 AM', 0115),
('AA07','12/15/2021 06:44 AM', 0119),
('AA08','12/15/2021 06:54 AM', 0125),
('AA09','12/15/2021 07:04 AM', 0135),
('AA10','12/15/2021 07:14 AM', 0155),
('AA11','12/15/2021 07:24 AM', 0195);
SELECT MIN(tt.code) AS FirstOfGroup,
MAX(tt.code) AS LastOfGroup,
COUNT(tt.code) AS NumberInGroup,
DATEPART(YEAR,tt.datetimePDT) AS [DataYear],
DATEPART(MONTH,tt.datetimePDT) AS [DataMonth],
DATEPART(DAY,tt.datetimePDT) AS [DataDay],
DATEPART(HOUR,tt.datetimePDT) AS [DataHour],
AVG(tt.[data]) AS AvgData
FROM @Table tt
GROUP BY DATEPART(YEAR,tt.datetimePDT),DATEPART(MONTH,tt.datetimePDT),DATEPART(DAY,tt.datetimePDT), DATEPART(HOUR,tt.datetimePDT)
FirstOfGroup | LastOfGroup | NumberInGroup | DataYear | DataMonth | DataDay | DataHour | AvgData |
---|---|---|---|---|---|---|---|
AA01 | AA02 | 2 | 2021 | 12 | 15 | 5 | 102 |
AA03 | AA08 | 6 | 2021 | 12 | 15 | 6 | 113 |
AA09 | AA11 | 3 | 2021 | 12 | 15 | 7 | 161 |