Home > Enterprise >  filter out entries with an uninterrupted Timeframe
filter out entries with an uninterrupted Timeframe

Time:09-12

I have raw data like below

INSP_EQP    SENDTIME    SENDTIME_NOW
Item A      00h             10
Item A      01h             10
Item A      02h             10
Item A      03h             10
Item A      04h             10
Item A      05h             10
Item A      06h             10
Item A      07h             10
Item A      09h             10
Item A      10h             10
Item B      00h             10
Item B      01h             10
Item B      02h             10
Item B      03h             10
Item B      04h             10
Item B      05h             10
Item B      06h             10
Item B      07h             10
Item B      08h             10
Item B      09h             10
Item B      10h             10

I want to choose which entry has only consecutive timeframes(at least 5 times) to the current time (SENDTIME_NOW = 10) without interruption. In this case, just keep the item B and remove item A. The result that I want is:

INSP_EQP    SENDTIME    SENDTIME_NOW
Item B      00h             10
Item B      01h             10
Item B      02h             10
Item B      03h             10
Item B      04h             10
Item B      05h             10
Item B      06h             10
Item B      07h             10
Item B      08h             10
Item B      09h             10
Item B      10h             10

I tried to find many functions, but it seems none of SQL oracle functions support this. Can anyone tell me if I should use SQL or use another language to do this. Thank you

CodePudding user response:

You can use match_recognize for that:

  1. You can check which rows suit your conditions, for example:

DBFiddle: https://dbfiddle.uk/RsuH2bG2

with t(INSP_EQP, SENDTIME, SENDTIME_NOW) as (
select 'Item A',   '00h',   10 from dual union all
select 'Item A',   '01h',   10 from dual union all
select 'Item A',   '02h',   10 from dual union all
select 'Item A',   '03h',   10 from dual union all
select 'Item A',   '04h',   10 from dual union all
select 'Item A',   '05h',   10 from dual union all
select 'Item A',   '06h',   10 from dual union all
select 'Item A',   '07h',   10 from dual union all
select 'Item A',   '09h',   10 from dual union all
select 'Item A',   '10h',   10 from dual union all
select 'Item B',   '00h',   10 from dual union all
select 'Item B',   '01h',   10 from dual union all
select 'Item B',   '02h',   10 from dual union all
select 'Item B',   '03h',   10 from dual union all
select 'Item B',   '04h',   10 from dual union all
select 'Item B',   '05h',   10 from dual union all
select 'Item B',   '06h',   10 from dual union all
select 'Item B',   '07h',   10 from dual union all
select 'Item B',   '08h',   10 from dual union all
select 'Item B',   '09h',   10 from dual union all
select 'Item B',   '10h',   10 from dual 
)
select *
from (-- inline view to make columns more readable
      select
         INSP_EQP
        ,to_number(rtrim(sendtime,'h')) SENDTIME_num
        ,SENDTIME
        ,SENDTIME_NOW
      from t
     )
match_recognize (
  partition by INSP_EQP
  order by SENDTIME_num
  measures
    classifier() as cls,
    count(*) as cnt,
    first(SENDTIME_num) as n_first,
    last (SENDTIME_num) as n_last
  ALL ROWS PER MATCH
  PATTERN (ok* err*)
      DEFINE
          ok    as  (ok.SENDTIME_num = 0 or ok.SENDTIME_num = prev(ok.SENDTIME_num) 1)
)

returns:

INSP_EQP   SENDTIME_NUM CLS          CNT    N_FIRST     N_LAST SENDTIME SENDTIME_NOW
---------- ------------ ----- ---------- ---------- ---------- -------- ------------
Item A                0 OK             1          0          0 00h                10
Item A                1 OK             2          0          1 01h                10
Item A                2 OK             3          0          2 02h                10
Item A                3 OK             4          0          3 03h                10
Item A                4 OK             5          0          4 04h                10
Item A                5 OK             6          0          5 05h                10
Item A                6 OK             7          0          6 06h                10
Item A                7 OK             8          0          7 07h                10
Item A                9 ERR            9          0          9 09h                10
Item A               10 ERR           10          0         10 10h                10
Item B                0 OK             1          0          0 00h                10
Item B                1 OK             2          0          1 01h                10
Item B                2 OK             3          0          2 02h                10
Item B                3 OK             4          0          3 03h                10
Item B                4 OK             5          0          4 04h                10
Item B                5 OK             6          0          5 05h                10
Item B                6 OK             7          0          6 06h                10
Item B                7 OK             8          0          7 07h                10
Item B                8 OK             9          0          8 08h                10
Item B                9 OK            10          0          9 09h                10
Item B               10 OK            11          0         10 10h                10

As you can see 'Item A' has 2 rows with 'Err', so you can now filter them out. For example using analytic functions:

DBFiddle 2: https://dbfiddle.uk/W5b-cE7L

select *
from (
    select 
       v.*
       ,count(case when cls='ERR' then 1 end)over(partition by insp_eqp) cnt_errs
    from (-- inline view to make columns more readable
          select
             INSP_EQP
            ,to_number(rtrim(sendtime,'h')) SENDTIME_num
            ,SENDTIME
            ,SENDTIME_NOW
          from t
         )
    match_recognize (
      partition by INSP_EQP
      order by SENDTIME_num
      measures
        classifier() as cls,
        count(*) as cnt,
        first(SENDTIME_num) as n_first,
        last (SENDTIME_num) as n_last
      ALL ROWS PER MATCH
      PATTERN (ok* err*)
          DEFINE
              ok    as  (ok.SENDTIME_num = 0 or ok.SENDTIME_num = prev(ok.SENDTIME_num) 1)
    ) v
)
where cnt_errs=0

It returns:

INSP_EQP   SENDTIME_NUM CLS          CNT    N_FIRST     N_LAST SENDTIME SENDTIME_NOW   CNT_ERRS
---------- ------------ ----- ---------- ---------- ---------- -------- ------------ ----------
Item B                0 OK             1          0          0 00h                10          0
Item B                1 OK             2          0          1 01h                10          0
Item B                2 OK             3          0          2 02h                10          0
Item B                3 OK             4          0          3 03h                10          0
Item B                4 OK             5          0          4 04h                10          0
Item B                5 OK             6          0          5 05h                10          0
Item B                6 OK             7          0          6 06h                10          0
Item B                7 OK             8          0          7 07h                10          0
Item B                8 OK             9          0          8 08h                10          0
Item B                9 OK            10          0          9 09h                10          0
Item B               10 OK            11          0         10 10h                10          0

11 rows selected.

CodePudding user response:

finally based on 1 answer in this post i was able to get the result i wanted. but there seems to be something wrong that makes me no longer see that person's reply. anyway thank you so much

select   INSP_EQP
        ,SENDTIME
        ,CHAMBER 
                ,AREA
        ,SENDTIME_NOW
from   (
        select  INSP_EQP
                        ,CHAMBER 
                                ,AREA
               ,SENDTIME
               ,SENDTIME_NOW
               ,SENDTIME_INT
               ,rn
               ,pre_2
               ,SENDTIME_INT-2
               ,case when RN = 1 and SENDTIME_INT-2 = pre_2 then 1 ELSE null END match
               ,count(case when RN = 1 and SENDTIME_INT-2 = pre_2 then 1 ELSE   null end) over(partition by INSP_EQP) as chk
        from   (
            select  INSP_EQP
                   ,SENDTIME
                   , SENDTIME_INT
                   ,CHAMBER 
                                     ,AREA
                   ,SENDTIME_NOW
                   ,row_number() over (partition by INSP_EQP order by SENDTIME_INT desc) as rn
                   ,lag(SENDTIME_INT, 2) over(partition by INSP_EQP order by SENDTIME_INT) as pre_2 
            from t
               ) t
        ) t
where    chk = 1
order by SENDTIME_INT
)
GROUP BY    INSP_EQP
          ,SENDTIME_NOW
          ,CHAMBER 
          ,AREA
ORDER   BY  COUNT(SENDTIME)
  • Related