I've data like below
Id | date | time | type |
---|---|---|---|
1 | 01-01-2022 | 08:00 | in |
1 | 01-01-2022 | 11:30 | out |
1 | 01-01-2022 | 11:35 | out |
1 | 01-01-2022 | 12:45 | in |
1 | 01-01-2022 | 17:30 | out |
1 | 01-01-2022 | 01:00 | out |
expected output :
Id | start | end | totaltime |
---|---|---|---|
1 | 08:00 | 11:35 | 03:35:00 |
1 | 12:45 | 17:30 | 04:45:00 |
where date
is of DATE
and time
is of VARCHAR
type columns. I want to calculate, all in/out duration for an Id. I am not able to think logic how get duration for all in/out duration for this.
Can anyone suggest any ideas ?
CodePudding user response:
It's a typical gaps-and-islands
type of question, and one option to resolve would be conditionally using SUM() OVER ()
analytic function such as
WITH t1 AS
(
SELECT t.*,
TO_TIMESTAMP(TO_CHAR("date",'yyyy-mm-dd ')||time,'yyyy-mm-dd hh24:mi:ss') AS dt
FROM t --> your table
), t2 AS
(
SELECT t1.*,
SUM(CASE WHEN type = 'in' THEN 1 ELSE 0 END) OVER (PARTITION BY id ORDER BY dt) AS rn_in,
SUM(CASE WHEN type = 'out' THEN 1 ELSE 0 END) OVER (PARTITION BY id ORDER BY dt) AS rn_out,
CASE WHEN type = 'in' THEN dt END AS "in",
CASE WHEN type = 'out' THEN dt END AS "out"
FROM t1
)
SELECT id, MIN("in") AS "start", MAX("out") AS "end", MAX("out")-MIN("in") AS "totaltime"
FROM t2
WHERE rn_in >= 1
GROUP BY id, rn_in
ORDER BY "start"
where "date"
is considered to be a date
type column, not an ordinary string as commented.
CodePudding user response:
In Oracle, a DATE
data type is a binary data type consisting of 7-bytes representing: century, year-of-century, month, day, hour, minute and second and it ALWAYS has those components. Given that, there is no point in having separate date and time columns and you can combine your two columns into one and have the sample data:
CREATE TABLE table_name (Id, datetime, type) AS
SELECT 1, DATE '2022-01-01' INTERVAL '08:00' HOUR TO MINUTE, 'in' FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01' INTERVAL '11:30' HOUR TO MINUTE, 'out' FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01' INTERVAL '11:35' HOUR TO MINUTE, 'out' FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01' INTERVAL '12:45' HOUR TO MINUTE, 'in' FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01' INTERVAL '17:30' HOUR TO MINUTE, 'out' FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01' INTERVAL '01:00' HOUR TO MINUTE, 'out' FROM DUAL;
Note: Another option to create the sample data is to use TO_DATE('2022-01-01 12:45', 'YYYY-MM-DD HH24:MI')
.
Then, from Oracle 12, you can use MATCH_RECOGNIZE
to perform row-by-row operations on the data:
SELECT m.*,
(end_dt - start_dt) DAY TO SECOND AS duration
FROM table_name
MATCH_RECOGNIZE (
PARTITION BY id
ORDER BY datetime
MEASURES
FIRST(ins.datetime) AS start_dt,
LAST(outs.datetime) AS end_dt
PATTERN (ins outs )
DEFINE
ins AS type = 'in',
outs AS type = 'out'
) m
Which outputs:
ID START_DT END_DT DURATION 1 2022-01-01 08:00:00 2022-01-01 11:35:00 00 03:35:00.000000 1 2022-01-01 12:45:00 2022-01-01 17:30:00 00 04:45:00.000000
If you do keep separate date and time columns (you should not) then you can combine them into a single column before using MATCH_RECOGNIZE
:
SELECT m.*,
(end_dt - start_dt) DAY TO SECOND AS duration
FROM (
SELECT id,
TO_DATE(TO_CHAR("DATE", 'YYYY-MM-DD') || time, 'YYYY-MM-DDHH24:MI') AS datetime,
type
FROM table_name
)
MATCH_RECOGNIZE (
PARTITION BY id
ORDER BY datetime
MEASURES
FIRST(ins.datetime) AS start_dt,
LAST(outs.datetime) AS end_dt
PATTERN (ins outs )
DEFINE
ins AS type = 'in',
outs AS type = 'out'
) m
db<>fiddle here
CodePudding user response:
By one comment I assumed that every 'in' record got its matching 'out' record on the same day, so I solved it by getting the next 'in' on a day and finding max 'out' in between:
with t1 as
(select ID, "date" , "time" as "start", LEAD("time",1,'24:00') over(partition by id, "date" order by "time") as "next"
from test
where "type" = 'in'),
t3 as
(select ID, "start", (select max("time") from test t2
where t1.id = t2.id and t1."date" = t2."date"
and t2."type" = 'out' and t2."time" between t1."start"
and t1."next" ) as "end"
from t1)
select ID, "start", "end", (case when "end" is null then null else (TO_DSINTERVAL('0 '||"end"||':00') - TO_DSINTERVAL('0 '||"start"||':00')) end) as totaltime
from t3