Home > Software engineering >  calculate each in and out duration for an id
calculate each in and out duration for an id

Time:04-29

I've data like below

Id date time type
1 01-01-2022 08:00 in
1 01-01-2022 11:30 out
1 01-01-2022 11:35 out
1 01-01-2022 12:45 in
1 01-01-2022 17:30 out
1 01-01-2022 01:00 out

expected output :

Id start end totaltime
1 08:00 11:35 03:35:00
1 12:45 17:30 04:45:00

where date is of DATE and time is of VARCHAR type columns. I want to calculate, all in/out duration for an Id. I am not able to think logic how get duration for all in/out duration for this.

Can anyone suggest any ideas ?

CodePudding user response:

It's a typical gaps-and-islands type of question, and one option to resolve would be conditionally using SUM() OVER () analytic function such as

WITH t1 AS
(
 SELECT t.*,
        TO_TIMESTAMP(TO_CHAR("date",'yyyy-mm-dd ')||time,'yyyy-mm-dd hh24:mi:ss') AS dt
   FROM t --> your table
), t2 AS
(   
  SELECT t1.*, 
         SUM(CASE WHEN type = 'in' THEN 1 ELSE 0 END) OVER (PARTITION BY id ORDER BY dt) AS rn_in,
         SUM(CASE WHEN type = 'out' THEN 1 ELSE 0 END) OVER (PARTITION BY id ORDER BY dt) AS rn_out,
         CASE WHEN type = 'in' THEN dt END AS "in",
         CASE WHEN type = 'out' THEN dt END AS "out"  
    FROM t1
) 
SELECT id, MIN("in") AS "start", MAX("out") AS "end", MAX("out")-MIN("in") AS "totaltime"
  FROM t2
 WHERE rn_in >= 1 
 GROUP BY id, rn_in
 ORDER BY "start"

where "date" is considered to be a date type column, not an ordinary string as commented.

Demo

CodePudding user response:

In Oracle, a DATE data type is a binary data type consisting of 7-bytes representing: century, year-of-century, month, day, hour, minute and second and it ALWAYS has those components. Given that, there is no point in having separate date and time columns and you can combine your two columns into one and have the sample data:

CREATE TABLE table_name (Id, datetime, type) AS
SELECT 1, DATE '2022-01-01'   INTERVAL '08:00' HOUR TO MINUTE, 'in'  FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01'   INTERVAL '11:30' HOUR TO MINUTE, 'out' FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01'   INTERVAL '11:35' HOUR TO MINUTE, 'out' FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01'   INTERVAL '12:45' HOUR TO MINUTE, 'in'  FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01'   INTERVAL '17:30' HOUR TO MINUTE, 'out' FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01'   INTERVAL '01:00' HOUR TO MINUTE, 'out' FROM DUAL;

Note: Another option to create the sample data is to use TO_DATE('2022-01-01 12:45', 'YYYY-MM-DD HH24:MI').

Then, from Oracle 12, you can use MATCH_RECOGNIZE to perform row-by-row operations on the data:

SELECT m.*,
       (end_dt - start_dt) DAY TO SECOND AS duration
FROM   table_name
MATCH_RECOGNIZE (
  PARTITION BY id
  ORDER     BY datetime
  MEASURES
    FIRST(ins.datetime) AS start_dt,
    LAST(outs.datetime) AS end_dt
  PATTERN (ins  outs )
  DEFINE
    ins  AS type = 'in',
    outs AS type = 'out'
) m

Which outputs:

ID START_DT END_DT DURATION
1 2022-01-01 08:00:00 2022-01-01 11:35:00 00 03:35:00.000000
1 2022-01-01 12:45:00 2022-01-01 17:30:00 00 04:45:00.000000

If you do keep separate date and time columns (you should not) then you can combine them into a single column before using MATCH_RECOGNIZE:

SELECT m.*,
       (end_dt - start_dt) DAY TO SECOND AS duration
FROM   (
  SELECT id,
         TO_DATE(TO_CHAR("DATE", 'YYYY-MM-DD') || time, 'YYYY-MM-DDHH24:MI') AS datetime,
         type
  FROM   table_name
)
MATCH_RECOGNIZE (
  PARTITION BY id
  ORDER     BY datetime
  MEASURES
    FIRST(ins.datetime) AS start_dt,
    LAST(outs.datetime) AS end_dt
  PATTERN (ins  outs )
  DEFINE
    ins  AS type = 'in',
    outs AS type = 'out'
) m

db<>fiddle here

CodePudding user response:

By one comment I assumed that every 'in' record got its matching 'out' record on the same day, so I solved it by getting the next 'in' on a day and finding max 'out' in between:

with t1 as 
(select ID, "date" , "time" as "start", LEAD("time",1,'24:00') over(partition by id, "date" order by  "time") as "next"
 from test 
 where "type" = 'in'),
t3 as
(select ID,  "start", (select max("time") from test t2
                        where t1.id = t2.id and t1."date" = t2."date" 
                          and t2."type" = 'out' and t2."time" between t1."start"
                          and t1."next" ) as "end"
from t1)
select ID, "start", "end", (case when "end" is null then null else (TO_DSINTERVAL('0 '||"end"||':00') - TO_DSINTERVAL('0 '||"start"||':00'))  end) as totaltime
from t3 
  • Related