THE ISSUE
I need to count the defaults (faults) per day of a machine whose logs are stored in a MySQL table every 5 seconds, like this:
time | def1 | def2 | def3 | def4 | def5 |
---|---|---|---|---|---|
2022-09-01 12:00:00 | 0 | 0 | 0 | 0 | 0 |
2022-09-01 12:00:05 | 0 | 1 | 0 | 0 | 0 |
2022-09-01 12:00:10 | 0 | 1 | 0 | 0 | 0 |
2022-09-01 12:00:15 | 0 | 0 | 0 | 1 | 0 |
2022-09-01 12:00:20 | 1 | 0 | 0 | 1 | 0 |
2022-09-01 12:00:25 | 1 | 0 | 0 | 0 | 0 |
2022-09-01 12:00:30 | 0 | 1 | 0 | 0 | 0 |
2022-09-01 12:00:35 | 0 | 0 | 0 | 1 | 0 |
2022-09-01 12:00:40 | 0 | 0 | 0 | 0 | 0 |
2022-09-01 12:00:45 | 0 | 0 | 0 | 0 | 0 |
2022-09-02 12:00:00 | 0 | 0 | 0 | 0 | 0 |
2022-09-02 12:00:05 | 0 | 0 | 1 | 0 | 0 |
2022-09-02 12:00:10 | 1 | 0 | 0 | 0 | 0 |
2022-09-02 12:00:15 | 1 | 0 | 1 | 0 | 1 |
2022-09-02 12:00:20 | 1 | 0 | 1 | 0 | 1 |
2022-09-02 12:00:25 | 1 | 0 | 0 | 0 | 1 |
2022-09-02 12:00:30 | 1 | 1 | 0 | 0 | 1 |
2022-09-02 12:00:35 | 0 | 1 | 1 | 0 | 0 |
2022-09-02 12:00:40 | 0 | 0 | 1 | 0 | 0 |
2022-09-02 12:00:45 | 0 | 0 | 0 | 0 | 0 |
So the query needs to count only the rising edges of each column, grouped by the date part of the `time` column.
Expected result:
date | def1 | def2 | def3 | def4 | def5 |
---|---|---|---|---|---|
2022-09-01 | 1 | 2 | 0 | 2 | 0 |
2022-09-02 | 1 | 1 | 3 | 0 | 1 |
I currently have about 30 columns and 3,800,000 records for 6 months of logs. Our customer wants to keep at least 6 months, ideally 2 years, of logs.
So for me the difficulty is to be efficient in storage while still retrieving the data as quickly as possible.
FINDING SOLUTION
Based on the following logic (n = current record, m = previous record, x = wanted result):
m 0 0 1 1
n 0 1 0 1
_______
x 0 1 0 0
We want only the rising edges (not the falling edges), i.e. changes from 0 to 1, ignoring all other transitions:
Def3 on 2022-09-02   0  1  0  1  1  0  0  1  1  0
occurred 3 times        ↑     ↑           ↑
What I have tried so far:
n - m                0  1 -1  1  0 -1  0  1  0 -1
ABS(n - m)           0  1  1  1  0  1  0  1  0  1
n XOR m              0  1  1  1  0  1  0  1  0  1
ABS and XOR flag falling edges as well, so they count every change. So, how do I get only the rising edges? In other words, how do I treat 1 as true and -1/0 as false?
CodePudding user response:
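WITH
cte AS (
    -- Number the rows within each day, in chronological order
    SELECT DATE(dt) dt, def1, def2, def3, def4, def5,
           ROW_NUMBER() OVER (PARTITION BY DATE(dt) ORDER BY dt) rn
    FROM test
)
-- Pair each row (t1) with the next row of the same day (t2);
-- the comparison is 1 only when the value goes from 0 to 1
SELECT dt,
       SUM(t2.def1 > t1.def1) def1,
       SUM(t2.def2 > t1.def2) def2,
       SUM(t2.def3 > t1.def3) def3,
       SUM(t2.def4 > t1.def4) def4,
       SUM(t2.def5 > t1.def5) def5
FROM cte t1
JOIN cte t2 USING (dt)
WHERE t2.rn = t1.rn + 1
GROUP BY 1;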
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=c8402e9ea53a0aa580b18868c913465a
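For anyone without a MySQL 8.0 server at hand, here is a minimal sketch that checks the query above against the sample rows from the question. It is only an approximation: it uses Python's built-in sqlite3 module as a stand-in for MySQL (assuming an SQLite build with window functions, 3.25 or later), it renames the date alias to `d` and moves the join condition into `ON` to avoid any ambiguity in SQLite, and the function name `daily_rising_edges` is made up for the example.

```python
import sqlite3

# Sample rows copied from the question: (time, def1, def2, def3, def4, def5).
ROWS = [
    ("2022-09-01 12:00:00", 0, 0, 0, 0, 0),
    ("2022-09-01 12:00:05", 0, 1, 0, 0, 0),
    ("2022-09-01 12:00:10", 0, 1, 0, 0, 0),
    ("2022-09-01 12:00:15", 0, 0, 0, 1, 0),
    ("2022-09-01 12:00:20", 1, 0, 0, 1, 0),
    ("2022-09-01 12:00:25", 1, 0, 0, 0, 0),
    ("2022-09-01 12:00:30", 0, 1, 0, 0, 0),
    ("2022-09-01 12:00:35", 0, 0, 0, 1, 0),
    ("2022-09-01 12:00:40", 0, 0, 0, 0, 0),
    ("2022-09-01 12:00:45", 0, 0, 0, 0, 0),
    ("2022-09-02 12:00:00", 0, 0, 0, 0, 0),
    ("2022-09-02 12:00:05", 0, 0, 1, 0, 0),
    ("2022-09-02 12:00:10", 1, 0, 0, 0, 0),
    ("2022-09-02 12:00:15", 1, 0, 1, 0, 1),
    ("2022-09-02 12:00:20", 1, 0, 1, 0, 1),
    ("2022-09-02 12:00:25", 1, 0, 0, 0, 1),
    ("2022-09-02 12:00:30", 1, 1, 0, 0, 1),
    ("2022-09-02 12:00:35", 0, 1, 1, 0, 0),
    ("2022-09-02 12:00:40", 0, 0, 1, 0, 0),
    ("2022-09-02 12:00:45", 0, 0, 0, 0, 0),
]

# Same idea as the answer: number the rows per day, then join each row (t1)
# to the next row of the same day (t2) and sum the 0 -> 1 comparisons.
QUERY = """
WITH cte AS (
    SELECT DATE(dt) AS d, def1, def2, def3, def4, def5,
           ROW_NUMBER() OVER (PARTITION BY DATE(dt) ORDER BY dt) AS rn
    FROM test
)
SELECT t1.d,
       SUM(t2.def1 > t1.def1),
       SUM(t2.def2 > t1.def2),
       SUM(t2.def3 > t1.def3),
       SUM(t2.def4 > t1.def4),
       SUM(t2.def5 > t1.def5)
FROM cte t1
JOIN cte t2 ON t2.d = t1.d AND t2.rn = t1.rn + 1
GROUP BY t1.d
ORDER BY t1.d
"""


def daily_rising_edges(rows):
    """Load rows into an in-memory table and return one result tuple per day."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE test (dt TEXT, def1 INTEGER, def2 INTEGER, "
            "def3 INTEGER, def4 INTEGER, def5 INTEGER)"
        )
        conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in daily_rising_edges(ROWS):
        print(row)
```

Note that because the rows are numbered per day, a rising edge between the last record of one day and the first record of the next is not counted.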
CodePudding user response:
After a lot of reading I found a solution: CAST(value AS BOOL/BOOLEAN) is not available in MySQL, but CAST(value > 0 AS UNSIGNED) is a workaround.
Database records:
Def3 on 2022-09-02               0  1  0  1  1  0  0  1  1  0
we want only the 3 rising edges     ↑     ↑           ↑
Finding rising edges:
CAST( (n - m) > 0 AS UNSIGNED)   0  1  0  1  0  0  0  1  0  0
only 3 rising edges                 ↑     ↑           ↑
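The comparison above can be checked outside the database with a few lines of Python. This is just a sketch of the arithmetic, not part of the query; the function names `count_rising_edges` and `count_changes` are made up for the example, and the list holds the Def3 values of 2022-09-02 from the question.

```python
def count_rising_edges(values):
    """Count 0 -> 1 transitions between consecutive samples."""
    # Pair each sample n with its predecessor m; (n - m) > 0 holds only
    # when m = 0 and n = 1, i.e. on a rising edge.
    return sum(1 for m, n in zip(values, values[1:]) if (n - m) > 0)


def count_changes(values):
    """Count every change (rising or falling), as ABS(n - m) or n XOR m would."""
    return sum(abs(n - m) for m, n in zip(values, values[1:]))


# Def3 on 2022-09-02, copied from the question.
DEF3_2022_09_02 = [0, 1, 0, 1, 1, 0, 0, 1, 1, 0]

if __name__ == "__main__":
    print(count_rising_edges(DEF3_2022_09_02))  # rising edges only
    print(count_changes(DEF3_2022_09_02))       # rising and falling edges
```

On this sequence the first function gives 3 and the second gives 6, which is why ABS and XOR over-count.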
So here is my solution to count the defaults that occurred each day, though I suspect it is not the best we can do:
SUM( CAST( (n - m) > 0 AS UNSIGNED) ) ... GROUP BY date(`time`) now returns 3 as expected.
Complete query:
WITH `stats` AS (
    SELECT
        *,
        DATE(`time`) AS `date`,
        -- running row number, used to pair each record with the previous one
        (@pos := IFNULL(@pos, 0) + 1) AS `pos`
    FROM `def_log`
    #WHERE ... // your filters here
    ORDER BY `time` ASC
)
SELECT
    S2.`date`,  -- added so that each result row shows its day
    SUM( CAST( (S2.`Def1` - S1.`Def1`) > 0 AS UNSIGNED) ) AS `Def1`,
    SUM( CAST( (S2.`Def2` - S1.`Def2`) > 0 AS UNSIGNED) ) AS `Def2`,
    SUM( CAST( (S2.`Def3` - S1.`Def3`) > 0 AS UNSIGNED) ) AS `Def3`,
    SUM( CAST( (S2.`Def4` - S1.`Def4`) > 0 AS UNSIGNED) ) AS `Def4`,
    SUM( CAST( (S2.`Def5` - S1.`Def5`) > 0 AS UNSIGNED) ) AS `Def5`
-- S1 is the previous record (m), S2 the current record (n)
FROM `stats` S1
INNER JOIN `stats` S2
    ON S1.`pos` = S2.`pos` - 1
GROUP BY S2.`date`;
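One possible alternative, offered as a sketch rather than a tested replacement: on MySQL 8.0 the `LAG()` window function can fetch the previous record directly, which avoids both the self-join and the `@pos` user variable (as far as I know, assigning user variables inside expressions is deprecated in MySQL 8.0). The example below runs the idea through Python's built-in sqlite3 module as a stand-in for MySQL (assuming SQLite 3.25 or later), on the def1 and def3 columns of the sample data only. The names `edges`, `log_date`, `rise1`, `rise3` and `daily_rising_edges_lag` are made up for the example.

```python
import sqlite3

# (time, def1, def3) taken from the sample rows in the question.
ROWS = [
    ("2022-09-01 12:00:00", 0, 0), ("2022-09-01 12:00:05", 0, 0),
    ("2022-09-01 12:00:10", 0, 0), ("2022-09-01 12:00:15", 0, 0),
    ("2022-09-01 12:00:20", 1, 0), ("2022-09-01 12:00:25", 1, 0),
    ("2022-09-01 12:00:30", 0, 0), ("2022-09-01 12:00:35", 0, 0),
    ("2022-09-01 12:00:40", 0, 0), ("2022-09-01 12:00:45", 0, 0),
    ("2022-09-02 12:00:00", 0, 0), ("2022-09-02 12:00:05", 0, 1),
    ("2022-09-02 12:00:10", 1, 0), ("2022-09-02 12:00:15", 1, 1),
    ("2022-09-02 12:00:20", 1, 1), ("2022-09-02 12:00:25", 1, 0),
    ("2022-09-02 12:00:30", 1, 0), ("2022-09-02 12:00:35", 0, 1),
    ("2022-09-02 12:00:40", 0, 1), ("2022-09-02 12:00:45", 0, 0),
]

# LAG() returns the previous row's value (NULL for the very first row).
# The comparison is 1 on a rising edge, 0 otherwise, and NULL for the
# first row, which SUM() ignores.
QUERY = """
WITH edges AS (
    SELECT DATE(time) AS log_date,
           (def1 > LAG(def1) OVER (ORDER BY time)) AS rise1,
           (def3 > LAG(def3) OVER (ORDER BY time)) AS rise3
    FROM def_log
)
SELECT log_date, SUM(rise1), SUM(rise3)
FROM edges
GROUP BY log_date
ORDER BY log_date
"""


def daily_rising_edges_lag(rows):
    """Return (date, def1 rising edges, def3 rising edges) per day."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE def_log (time TEXT, def1 INTEGER, def3 INTEGER)")
        conn.executemany("INSERT INTO def_log VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in daily_rising_edges_lag(ROWS):
        print(row)
```

Like the query above, this compares each record with the previous one regardless of the day, so an edge that spans midnight is counted on the later day.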