Home > Enterprise >  Enumerate table rows by diff between timestamps
Enumerate table rows by diff between timestamps

Time:09-01

I've got the following ordered by user_id, timestamp table:

user_id timestamp
1000 1661919816
1000 1661919861
1001 1661919816

I need to enumerate this table with session id by the following rules:

  1. When diff between two timestamps is less than 1 minute, then it is the same session
  2. When diff is more than 1 minute, then it is previous session_id 1
  3. When user_id changes, so does session_id (again previous session_id 1)

So in my case I want something like this:

user_id timestamp session_id
1000 1661919816 1
1000 1661919861 1
1001 1661919816 2

I need something like dense_rank(), however I don't know how to apply it to diff between two timestamps

CodePudding user response:

You can use the LAG window function to parse the previous user_id and timestamp values, then you can check if the user_id value has changed, or if it is the same, check if the timestamp difference is greater than 1 minute (60 seconds). You can then use the SUM window function to calculate the value of session_id.

Such a query might look like this

SELECT 
    user_id,
    timestamp,
    SUM(flag) OVER (ORDER BY user_id, timestamp) session_id
FROM (    
SELECT
    *,
    CASE WHEN user_id = LAG(user_id) OVER (ORDER BY user_id, timestamp)
              AND timestamp - LAG(timestamp, 1, 0) OVER (ORDER BY user_id, timestamp) < 60
         THEN 0 -- user_id is not changed and timestamp difference is smaller than 1 minute, do not increment session_id
         ELSE 1 -- increment session_id since user_id is changed or timestamp difference is greater than 1 minute
    END flag
FROM sessions
) flags

For your sample data, query output

user_id timestamp session_id
1000 1661919816 1
1000 1661919861 1
1001 1661919816 2

For extended sample data like this

user_id timestamp
1000 1661919816
1000 1661919861
1000 1661919871
1000 1661919931
1001 1661919816
1001 1661919896
1001 1661919906

query output

user_id timestamp session_id
1000 1661919816 1
1000 1661919861 1
1000 1661919871 1
1000 1661919931 2
1001 1661919816 3
1001 1661919896 4
1001 1661919906 4

db<>fiddle here

  •  Tags:  
  • sql
  • Related