we've input data like -
Where plc_id is the id of a particular PLC zero-speed - 0 means machine is running, 1 means machine is stopped.
We want to calculate the downtime and store it into the sql. Here is the stream analytics query We've written -
WITH SelectPreviousEvent AS
(
SELECT
*,
LAG([time]) OVER (LIMIT DURATION(hour, 24)) as previous_time,
LAG([zero-speed]) OVER (LIMIT DURATION(hour, 24)) as previous_speed,
LAG([plc_id]) OVER (LIMIT DURATION(hour,24)) as previous_plc,
[plc_id]
FROM [iot-input] TIMESTAMP BY [time]
)
-- selects time when current_speed is equal to 1 and previous_speed is equal to 0
SELECT
LAG(time) OVER (LIMIT DURATION(hour, 24) WHEN previous_speed = 0 ) [started_time],
previous_time [end_time],
[plc_id],
[created_by] = 'Stream Analytics',
[updated_by] = 'Stream Analytics',
[duration] = DATEDIFF(second, LAG(time) OVER (LIMIT DURATION(hour, 24) WHEN previous_speed = 0 ), previous_time)
INTO [sql-output]
FROM SelectPreviousEvent
WHERE
[zero-speed] = 0
AND previous_speed = 1
It is working as expected when there is data from only one plc_id.
Output -
And we've verified this downtime with customer, they said its correct.
But if we want to calculate downtime for multiple PLCs, like if there is one more PLC called plc5 and it is also sending the data like plc4, we're getting unexpected results.
I've written few other queries but none of them worked. I couldn't get much help from documentation on this scenario.
Thanks in advance.
CodePudding user response:
You need to PARTITION BY plc_id
in LAG
so that the calculation is done for that plc_id only. See that doc for more info.
Here that gives:
...
LAG([time]) OVER (PARTITION BY plc_id LIMIT DURATION(hour, 24)) as previous_time,
LAG([zero-speed]) OVER (PARTITION BY plc_id LIMIT DURATION(hour, 24)) as previous_speed,
...
You shouldn't use it for LAG(plc_id)
, or else you will get the same plc_id. But you should need it again for the following calculations.