I have raw data like below
INSP_EQP SENDTIME SENDTIME_NOW
Item A 00h 10
Item A 01h 10
Item A 02h 10
Item A 03h 10
Item A 04h 10
Item A 05h 10
Item A 06h 10
Item A 07h 10
Item A 09h 10
Item A 10h 10
Item B 00h 10
Item B 01h 10
Item B 02h 10
Item B 03h 10
Item B 04h 10
Item B 05h 10
Item B 06h 10
Item B 07h 10
Item B 08h 10
Item B 09h 10
Item B 10h 10
I want to choose which entry has only consecutive timeframes(at least 5 times) to the current time (SENDTIME_NOW = 10) without interruption. In this case, just keep the item B and remove item A. The result that I want is:
INSP_EQP SENDTIME SENDTIME_NOW
Item B 00h 10
Item B 01h 10
Item B 02h 10
Item B 03h 10
Item B 04h 10
Item B 05h 10
Item B 06h 10
Item B 07h 10
Item B 08h 10
Item B 09h 10
Item B 10h 10
I tried to find many functions, but it seems none of SQL oracle functions support this. Can anyone tell me if I should use SQL or use another language to do this. Thank you
CodePudding user response:
You can use match_recognize
for that:
- You can check which rows suit your conditions, for example:
DBFiddle: https://dbfiddle.uk/RsuH2bG2
with t(INSP_EQP, SENDTIME, SENDTIME_NOW) as (
select 'Item A', '00h', 10 from dual union all
select 'Item A', '01h', 10 from dual union all
select 'Item A', '02h', 10 from dual union all
select 'Item A', '03h', 10 from dual union all
select 'Item A', '04h', 10 from dual union all
select 'Item A', '05h', 10 from dual union all
select 'Item A', '06h', 10 from dual union all
select 'Item A', '07h', 10 from dual union all
select 'Item A', '09h', 10 from dual union all
select 'Item A', '10h', 10 from dual union all
select 'Item B', '00h', 10 from dual union all
select 'Item B', '01h', 10 from dual union all
select 'Item B', '02h', 10 from dual union all
select 'Item B', '03h', 10 from dual union all
select 'Item B', '04h', 10 from dual union all
select 'Item B', '05h', 10 from dual union all
select 'Item B', '06h', 10 from dual union all
select 'Item B', '07h', 10 from dual union all
select 'Item B', '08h', 10 from dual union all
select 'Item B', '09h', 10 from dual union all
select 'Item B', '10h', 10 from dual
)
select *
from (-- inline view to make columns more readable
select
INSP_EQP
,to_number(rtrim(sendtime,'h')) SENDTIME_num
,SENDTIME
,SENDTIME_NOW
from t
)
match_recognize (
partition by INSP_EQP
order by SENDTIME_num
measures
classifier() as cls,
count(*) as cnt,
first(SENDTIME_num) as n_first,
last (SENDTIME_num) as n_last
ALL ROWS PER MATCH
PATTERN (ok* err*)
DEFINE
ok as (ok.SENDTIME_num = 0 or ok.SENDTIME_num = prev(ok.SENDTIME_num) 1)
)
returns:
INSP_EQP SENDTIME_NUM CLS CNT N_FIRST N_LAST SENDTIME SENDTIME_NOW
---------- ------------ ----- ---------- ---------- ---------- -------- ------------
Item A 0 OK 1 0 0 00h 10
Item A 1 OK 2 0 1 01h 10
Item A 2 OK 3 0 2 02h 10
Item A 3 OK 4 0 3 03h 10
Item A 4 OK 5 0 4 04h 10
Item A 5 OK 6 0 5 05h 10
Item A 6 OK 7 0 6 06h 10
Item A 7 OK 8 0 7 07h 10
Item A 9 ERR 9 0 9 09h 10
Item A 10 ERR 10 0 10 10h 10
Item B 0 OK 1 0 0 00h 10
Item B 1 OK 2 0 1 01h 10
Item B 2 OK 3 0 2 02h 10
Item B 3 OK 4 0 3 03h 10
Item B 4 OK 5 0 4 04h 10
Item B 5 OK 6 0 5 05h 10
Item B 6 OK 7 0 6 06h 10
Item B 7 OK 8 0 7 07h 10
Item B 8 OK 9 0 8 08h 10
Item B 9 OK 10 0 9 09h 10
Item B 10 OK 11 0 10 10h 10
As you can see 'Item A' has 2 rows with 'Err', so you can now filter them out. For example using analytic functions:
DBFiddle 2: https://dbfiddle.uk/W5b-cE7L
select *
from (
select
v.*
,count(case when cls='ERR' then 1 end)over(partition by insp_eqp) cnt_errs
from (-- inline view to make columns more readable
select
INSP_EQP
,to_number(rtrim(sendtime,'h')) SENDTIME_num
,SENDTIME
,SENDTIME_NOW
from t
)
match_recognize (
partition by INSP_EQP
order by SENDTIME_num
measures
classifier() as cls,
count(*) as cnt,
first(SENDTIME_num) as n_first,
last (SENDTIME_num) as n_last
ALL ROWS PER MATCH
PATTERN (ok* err*)
DEFINE
ok as (ok.SENDTIME_num = 0 or ok.SENDTIME_num = prev(ok.SENDTIME_num) 1)
) v
)
where cnt_errs=0
It returns:
INSP_EQP SENDTIME_NUM CLS CNT N_FIRST N_LAST SENDTIME SENDTIME_NOW CNT_ERRS
---------- ------------ ----- ---------- ---------- ---------- -------- ------------ ----------
Item B 0 OK 1 0 0 00h 10 0
Item B 1 OK 2 0 1 01h 10 0
Item B 2 OK 3 0 2 02h 10 0
Item B 3 OK 4 0 3 03h 10 0
Item B 4 OK 5 0 4 04h 10 0
Item B 5 OK 6 0 5 05h 10 0
Item B 6 OK 7 0 6 06h 10 0
Item B 7 OK 8 0 7 07h 10 0
Item B 8 OK 9 0 8 08h 10 0
Item B 9 OK 10 0 9 09h 10 0
Item B 10 OK 11 0 10 10h 10 0
11 rows selected.
CodePudding user response:
finally based on 1 answer in this post i was able to get the result i wanted. but there seems to be something wrong that makes me no longer see that person's reply. anyway thank you so much
select INSP_EQP
,SENDTIME
,CHAMBER
,AREA
,SENDTIME_NOW
from (
select INSP_EQP
,CHAMBER
,AREA
,SENDTIME
,SENDTIME_NOW
,SENDTIME_INT
,rn
,pre_2
,SENDTIME_INT-2
,case when RN = 1 and SENDTIME_INT-2 = pre_2 then 1 ELSE null END match
,count(case when RN = 1 and SENDTIME_INT-2 = pre_2 then 1 ELSE null end) over(partition by INSP_EQP) as chk
from (
select INSP_EQP
,SENDTIME
, SENDTIME_INT
,CHAMBER
,AREA
,SENDTIME_NOW
,row_number() over (partition by INSP_EQP order by SENDTIME_INT desc) as rn
,lag(SENDTIME_INT, 2) over(partition by INSP_EQP order by SENDTIME_INT) as pre_2
from t
) t
) t
where chk = 1
order by SENDTIME_INT
)
GROUP BY INSP_EQP
,SENDTIME_NOW
,CHAMBER
,AREA
ORDER BY COUNT(SENDTIME)