Home > Enterprise >  Counting length of date period and number of periods
Counting length of date period and number of periods

Time:05-26

I can not show a query that isn't working. Because I simply can not even wrap my brain around how to do this. I can count number of devices and number of dates for a specific device.

I have a table containing a series of dates where we have seen a specific device. Devices are registered daily, if they were seen.

I would like to count the number of breaks in the timestamp flow. So if there is a date missing in the timestamp it should register as a new block for this device.

Example data:

Since today is the 25th of may:

  • Device00 should count 3 times (3 starts [green], and 3 stops [red])
  • Device58 should count 2 times (3 starts [green], and 2 stops [red])

Device58, is not counted because it was active today (25/5). if Device58 is not active tomorrow, then it should be counted. If it is active tomorrow then it should not be counted.

Devices are only counted when there is at least a calendar day between entry's in the data.

Data are collected once a day (4 am) and is not time critical.

Can this be done with a SQL query? My data set is currently 100.000 rows, so I'm hoping for an "easy" way to count these. But I can't quite figure it out.

DATA:

Device TimeStamp
Device00 2022-04-25
Device00 2022-04-26
Device00 2022-04-27
Device00 2022-04-28
Device00 2022-04-29
Device00 2022-04-30
Device00 2022-05-01
Device00 2022-05-02
Device00 2022-05-03
Device00 2022-05-04
Device00 2022-05-05
Device00 2022-05-06
Device00 2022-05-07
Device00 2022-05-08
Device00 2022-05-09
Device00 2022-05-11
Device00 2022-05-12
Device00 2022-05-14
Device00 2022-05-15
Device00 2022-05-16
Device00 2022-05-17
Device00 2022-05-18
Device00 2022-05-19
Device58 2022-04-25
Device58 2022-04-26
Device58 2022-04-27
Device58 2022-04-28
Device58 2022-04-29
Device58 2022-04-30
Device58 2022-05-01
Device58 2022-05-02
Device58 2022-05-03
Device58 2022-05-04
Device58 2022-05-05
Device58 2022-05-06
Device58 2022-05-07
Device58 2022-05-08
Device58 2022-05-09
Device58 2022-05-11
Device58 2022-05-12
Device58 2022-05-14
Device58 2022-05-15
Device58 2022-05-16
Device58 2022-05-17
Device58 2022-05-18
Device58 2022-05-19
Device58 2022-05-20
Device58 2022-05-21
Device58 2022-05-22
Device58 2022-05-23
Device58 2022-05-24
Device58 2022-05-25

Expected result:

Device Count
Device00 3
Device58 2

CodePudding user response:

SELECT t2.device,
  CASE WHEN t2.starts > t2.stops THEN t2.stops ELSE t2.starts END
FROM
(
  SELECT device,
    COUNT(CASE WHEN t.prev IS null OR t.prev > 1 THEN 1 ELSE null END) AS starts,
    COUNT(CASE WHEN t.next IS null OR t.next > 1 THEN 1 ELSE null END) AS stops
  FROM
  (
    SELECT *,
     DATEDIFF(day, LAG(timestamp) OVER(PARTITION BY device ORDER BY timestamp), timestamp)
     AS prev,
     DATEDIFF(day, timestamp, LEAD(timestamp) OVER(PARTITION BY device ORDER BY timestamp))
     AS next
    FROM tbl
  ) AS t
  WHERE timestamp <= DATEADD(day, -1, GETDATE())
  GROUP BY device
) AS t2;

The core idea of the query is utilizing "window function" LAG() and LEAD() which introduced in SQL Server 2012. Both function allows for queries to be aware to what comes before and after of the current row.

CodePudding user response:

Requires at least SQL Server 2012. In earlier versions you could do it with a cursor, I guess. The @Device variable stores the list of devices, @Log is your sample data. By default the result set shows aggregates for the last month, but if you want to see more - change the @Months variable.

DECLARE @Log TABLE ([Device] VARCHAR(50), [TimeStamp] DATE)
DECLARE @Device TABLE ([Device] VARCHAR(50))

--=========================================
--START OF TEST DATA
INSERT INTO @Log VALUES ('Device00', '2022-04-25'), ('Device00', '2022-04-26'), ('Device00', '2022-04-27'), ('Device00', '2022-04-28'), ('Device00', '2022-04-29'), ('Device00', '2022-04-30'), ('Device00', '2022-05-01'), ('Device00', '2022-05-02'), ('Device00', '2022-05-03'), ('Device00', '2022-05-04'), ('Device00', '2022-05-05'), ('Device00', '2022-05-06'), ('Device00', '2022-05-07'), ('Device00', '2022-05-08'), ('Device00', '2022-05-09'), ('Device00', '2022-05-11'), ('Device00', '2022-05-12'), ('Device00', '2022-05-14'), ('Device00', '2022-05-15'), ('Device00', '2022-05-16'), ('Device00', '2022-05-17'), ('Device00', '2022-05-18'), ('Device00', '2022-05-19'), ('Device58', '2022-04-25'), ('Device58', '2022-04-26'), ('Device58', '2022-04-27'), ('Device58', '2022-04-28'), ('Device58', '2022-04-29'), ('Device58', '2022-04-30'), ('Device58', '2022-05-01'), ('Device58', '2022-05-02'), ('Device58', '2022-05-03'), ('Device58', '2022-05-04'), ('Device58', '2022-05-05'), ('Device58', '2022-05-06'), ('Device58', '2022-05-07'), ('Device58', '2022-05-08'), ('Device58', '2022-05-09'), ('Device58', '2022-05-11'), ('Device58', '2022-05-12'), ('Device58', '2022-05-14'), ('Device58', '2022-05-15'), ('Device58', '2022-05-16'), ('Device58', '2022-05-17'), ('Device58', '2022-05-18'), ('Device58', '2022-05-19'), ('Device58', '2022-05-20'), ('Device58', '2022-05-21'), ('Device58', '2022-05-22'), ('Device58', '2022-05-23'), ('Device58', '2022-05-24'), ('Device58', '2022-05-25')

INSERT INTO @Device VALUES ('Device00'), ('Device58')
--END OF TEST DATA
--=========================================

DECLARE @Months INT = 1
DECLARE @Today DATE = GETDATE()
DECLARE @EarliestDate DATE = DATEADD(M, -@Months, @Today)

; WITH CTE
AS
(
    SELECT @Today AS [Date]
    UNION ALL
    SELECT DATEADD(D, -1, [Date])
    FROM CTE WHERE [Date] > @EarliestDate
)

SELECT [Device], COUNT([Starts]) AS [Starts], COUNT([Stops]) AS [Stops] FROM
(
SELECT
    [DeviceDates].[Device],
    CASE 
        WHEN
            (
                LEAD([Log].[TimeStamp], 1) OVER (ORDER BY [DeviceDates].[Device], [DeviceDates].[Date]) IS NOT NULL
                AND [Log].[TimeStamp] IS NULL
            )
        OR
            (
                ROW_NUMBER() OVER (PARTITION BY [DeviceDates].[Device] ORDER BY [DeviceDates].[Date]) = 1
                AND [Log].[TimeStamp] IS NOT NULL
            )
        THEN 1
    END [Starts],
    CASE 
        WHEN [DeviceDates].[Date] != @Today
        AND [Log].[TimeStamp] IS NOT NULL
        AND LEAD([Log].[TimeStamp], 1) OVER (ORDER BY [DeviceDates].[Device], [DeviceDates].[Date]) IS NULL
        THEN 1
    END [Stops]
FROM (
    SELECT Device, [Date] 
    FROM CTE
    CROSS JOIN @Device
) AS [DeviceDates]
LEFT OUTER JOIN @Log AS [Log]
    ON [DeviceDates].[Device] = [Log].[Device]
    AND [DeviceDates].[Date] = [Log].[TimeStamp]
) aggregated
GROUP BY [Device]

CodePudding user response:

with data as (
    select *,
        datediff(day, cast('20000101' as date), "TimeStamp")
          - row_number() over (partition by Device order by "TimeStamp") as grp
    from T
)
select distinct Device, count(*) over (partition by Device) as "Count"
from data
group by Device, grp
-- eliminate anything that still on a streak through yesterday
having max(Timestamp) < dateadd(day, -1, cast(getdate() as date));

Uses standard gaps and islands numbering. It does rely on row_number() so it will work on 2005 editions and later.

https://dbfiddle.uk/?rdbms=sqlserver_2014&fiddle=b1698ddc0350c3087a3fdaf9554c54ad

  • Related