Calculating Time Difference between Each Users Attempt-CodePudding

I have table in my Oracle database (v12.2.0.2.1) that tracks a users attempt to register an event and the timestamp of that attempt. The question I am trying to answer is the time difference between each users attempt.

The original table looks something like this:

USER_ACQUISITION_ATTEMPT_ID	USER_ID	RECORDING_DATE_ATTEMPT
1	189590	2022-OCT-12 14:57:54
2	189590	2022-OCT-12 14:58:15
3	189590	2022-OCT-12 15:01:50
4	189590	2022-OCT-12 15:02:11
5	189590	2022-OCT-12 15:02:40
6	189590	2022-OCT-12 15:03:51
7	22809	2022-OCT-12 15:04:32
8	189590	2022-OCT-12 15:09:05
9	189590	2022-OCT-12 15:10:16
10	189590	2022-OCT-12 15:10:24
11	189590	2022-OCT-12 15:10:42
12	189590	2022-OCT-12 15:13:00
13	178275	2022-OCT-12 15:45:22
14	180336	2022-OCT-12 15:53:25
15	180336	2022-OCT-12 15:53:50

Where each attempt is uniquely registered in order by the USER_ACQUISITION_ATTEMPT_ID. The USER_ID is the ID of the user that makes the attempt and the RECORDING_DATE_ATTEMPT is the timestamp.

I've created the following query that will show the time different between each attempt throughout the whole table using the LAG function to compare the dates from one row to the next and then just some math to calculate the time difference.

WITH dat AS 
  (
  SELECT
    a.USER_ACQUISITION_ATTEMPT_ID,  
    a.USER_ID, 
    a.RECORDING_DATE_ATTEMP DAT1,
    a.RECORDING_DATE_ATTEMP DAT2
  FROM 
    USER_ACQUISITION_ATTEMPT a

  UNION ALL

  SELECT
    c.USER_ACQUISITION_ATTEMPT_ID,
    c.USER_ID,    
    c.RECORDING_DATE_ATTEMP DAT1,
    c.RECORDING_DATE_ATTEMP DAT2
  FROM 
    USER_ACQUISITION_ATTEMPT c 
   )
   
SELECT
  USER_ACQUISITION_ATTEMPT_ID,
  USER_ID,
  DAT1,
  LAG(DAT1,1) OVER (ORDER BY DAT1) LAG,
  TO_CHAR(DATE '1900-01-01'   ABS (LAG(DAT1,1) OVER (ORDER BY DAT1) - DAT1), 'HH24:MI:SS') as TIME_DIFF
FROM dat
group by 
  USER_ACQUISITION_ATTEMPT_ID,
  USER_ID,  
  DAT1

Which yields something like this:

USER_ACQUISITION_ATTEMPT_ID	USER_ID	DAT1	LAG	TIME_DIFF
1	189590	2022-OCT-12 14:57:54	NULL	NULL
2	189590	2022-OCT-12 14:58:15	2022-OCT-12 14:57:54	00:00:21
3	189590	2022-OCT-12 15:01:50	2022-OCT-12 14:58:15	00:03:35
4	189590	2022-OCT-12 15:02:11	2022-OCT-12 15:01:50	00:00:21
5	189590	2022-OCT-12 15:02:40	2022-OCT-12 15:02:11	00:00:29
6	189590	2022-OCT-12 15:03:51	2022-OCT-12 15:02:40	00:01:11
7	22809	2022-OCT-12 15:04:32	2022-OCT-12 15:03:51	00:00:41
8	189590	2022-OCT-12 15:09:05	2022-OCT-12 15:04:32	00:04:33
9	189590	2022-OCT-12 15:10:16	2022-OCT-12 15:09:05	00:01:11
10	189590	2022-OCT-12 15:10:24	2022-OCT-12 15:10:16	00:00:08
11	189590	2022-OCT-12 15:10:42	2022-OCT-12 15:10:24	00:00:18
12	189590	2022-OCT-12 15:13:00	2022-OCT-12 15:10:42	00:02:18
13	178275	2022-OCT-12 15:45:22	2022-OCT-12 15:13:00	00:32:22
14	180336	2022-OCT-12 15:53:25	2022-OCT-12 15:45:22	00:08:03
15	180336	2022-OCT-12 15:53:50	2022-OCT-12 15:53:25	00:00:25

A mentioned, I want to look at each user individually. The above query doesn't do that. For example, in the result table you will see the first row has the very first attempt and its timestamp. Under the LAG and TIME_DIFF column, there are nulls because no other attempt came before it. Each subsequent row grabs the last DAT1 and places it in the next row under the LAG column. That works fine until the next user tries an attempt. Where you see the USER_ID change from 189590 on USER_ACQUISITION_ATTEMPT_ID #6 to USER_ID 22809 on USER_ACQUISITION_ATTEMPT_ID #7, the last time stamp from the previous user is used against the first timestamp of the next user which is not what I want. The first attempt for any given user should have NULL for both the LAG and TIME_DIFF columns so that the query can sort of reset the calculation of the time differences for each user.

I hope I have explained the issue clearly, however if clarification is needed, please add some comments.

CodePudding user response：

You don't need the UNION ALL in the common table expression. It's just duplicating the data, which you then use GROUP BY to de-duplicate it.

Removing the UNION ALL means that you can remove the GROUP BY.

Then just use PARTITION BY in the analytic functions.

SELECT
  USER_ACQUISITION_ATTEMPT_ID,
  USER_ID,
  RECORDING_DATE_ATTEMP,
  LAG(RECORDING_DATE_ATTEMP)
    OVER (
      PARTITION BY USER_ID
          ORDER BY RECORDING_DATE_ATTEMP
    )
      AS LAG_RECORDING_DATE_ATTEMP
FROM
  USER_ACQUISITION_ATTEMPT