Home > Net >  How to use sql/mysql to handle null values and certain cases?
How to use sql/mysql to handle null values and certain cases?

Time:03-24

suppose we have the table below to track the shipping department's machine. It typically takes 5 mins for the machine to scan certain numbers of packages. However, it might have some problems during the scan process. Therefore, we would like to track each of the scan event durations? (Please refer to the below table)

date eventTime event actions
2020-02-01 2020-02-01 10:05:25 scan_package start
2020-02-01 2020-02-01 10:10:25 scan_package end
2020-02-01 2020-02-01 10:25:25 scan_package start
2020-02-01 2020-02-01 10:30:25 error
2020-02-01 2020-02-01 10:35:25 scan_package start
2020-02-01 2020-02-01 10:40:25 scan_package end
2020-02-01 2020-02-01 10:42:25 scan_package start
2020-02-01 2020-02-01 10:47:25 scan_package end
2020-02-01 2020-02-01 10:51:25 scan_package start
2020-02-01 scan_package start
2020-02-02 2020-02-02 08:45:25 scan_package start

The scan duration is equivalent: (event = scan_package & action = end) - (event = scan_package & action = start)

However, there are some corner cases that need to handle.

  1. for row3 (eventTime:2020-02-01_10:25:25) from the table, there is no corresponding end scan event, due to within 5 mins after the scan started, there is no scan ended. Therefore, we will not have the scan duration for this row, but we still need to record the event to the final table. (not sure how to record in this case)

Basically, in order to consider a valid scan duration, the start and end should be within 6 mins.

  1. will not consider the scan event is valid if there is no eventTime like (last second row from the table), we will just ignore the scan event.

  2. assume there are no situations like (actions = start, start, start, then end ...) which means there will not be multiple start actions happening at the same time or in sequence.

  3. assume the column eventTime is timestamp type

Sample desired final output table:

date eventTime1 event1 eventTime2 event2 scan_duration_in_mins
2020-02-01 2020-02-01 10:05:25 scan_start 2020-02-01 10:10:25 scan_end 5
2020-02-01 2020-02-01 10:10:25 scan_start null null null
2020-02-01 2020-02-01 10:35:25 scan_start 2020-02-01 10:40:25 scan_end 5
2020-02-01 2020-02-01 10:42:25 scan_start 2020-02-01 10:47:25 scan_end 5
2020-02-01 2020-02-01 10:51:25 scan_start null null null

Below is some of my code,

CREATE TABLE scan (
    date DATE,
    eventTime datetime,
    event VARCHAR(100),
    actions VARCHAR(100)
);

INSERT INTO scan VALUES ( '2020-02-01' , '2020-02-01 10:05:25', 'scan_package' , 'start');
INSERT INTO scan VALUES ( '2020-02-01' , '2020-02-01 10:10:25', 'scan_package' , 'end');
INSERT INTO scan VALUES ( '2020-02-01' , '2020-02-01 10:25:25', 'scan_package' , 'start');
INSERT INTO scan VALUES ( '2020-02-01' , '2020-02-01 10:30:25', 'error' ,          null );
INSERT INTO scan VALUES ( '2020-02-01' , '2020-02-01 10:35:25', 'scan_package' , 'start');
INSERT INTO scan VALUES ( '2020-02-01' , '2020-02-01 10:40:25', 'scan_package' , 'end');
INSERT INTO scan VALUES ( '2020-02-01' , '2020-02-01 10:42:25', 'scan_package' , 'start');
INSERT INTO scan VALUES ( '2020-02-01' , '2020-02-01 10:47:25', 'scan_package' , 'end');
INSERT INTO scan VALUES ( '2020-02-01' , '2020-02-01 10:51:25', 'scan_package' , 'start');
INSERT INTO scan VALUES ( '2020-02-01' ,  null ,                'scan_package' , 'start');
INSERT INTO scan VALUES ( '2020-02-02' , '2020-02-02 08:45:25', 'scan_package' , 'start');

SELECT 
    t1.date as date,
    t1.eventTime as scan_start_time,
    t1.event as scan_start,
    t2.eventTime as scan_end_time,
    t2.event as scan_end
FROM  scan as t1 
LEFT JOIN scan as t2 ON t2.eventTime BETWEEN t1.eventTime   INTERVAL 4 MINUTE
                                       AND t1.eventTime   INTERVAL 6 MINUTE
                  AND t1.actions = 'start'
                  AND t2.actions <> 'start';
                 

However, the code above did not get the desired output. Can someone please provide me with some suggestions or SQL code? Thanks a lot

CodePudding user response:

in order to consider a valid scan duration, the start and end should be within 6 mins.

SELECT *, TIMESTAMPDIFF(MINUTE, t1.eventTime, t2.eventTime) duration
FROM scan t1
LEFT JOIN scan t2 ON t2.eventTime BETWEEN t1.eventTime   INTERVAL 4 MINUTE
                                      AND t1.eventTime   INTERVAL 6 MINUTE
                  AND t2.actions <> 'start'
WHERE t1.actions = 'start'
  AND t1.eventTime IS NOT NULL;

https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=ab9a56a4f12d2462139ecba24eaacbdf

  • Related