I have some data that I'm trying to bucket. Let's say the data has an user and timestamp. I want to define a session as any rows that has a timestamp within 10 minutes of the previous timestamp by user.
How would I go about this in SQL?
Example
------ --------------------- ---------
| user | timestamp | session |
------ --------------------- ---------
| 1 | 2021-05-09 15:12:52 | 1 |
| 1 | 2021-05-09 15:18:52 | 1 | within 10 min of previous timestamp
| 1 | 2021-05-09 15:32:52 | 2 | over 10 min, new session
| 2 | 2021-05-09 16:00:00 | 1 | different user
| 1 | 2021-05-09 17:00:00 | 3 | new session
| 1 | 2021-05-09 17:02:00 | 3 |
------ --------------------- ---------
This will give me records within 10 minutes but how would I bucket them like above?
with cte as (
select user,
timestamp,
lag(timestamp) over (partition by user order by timestamp) as last_timestamp
from table
)
select *
from cte
where datediff(mm, last_timestamp, timestamp) <= 10
CodePudding user response:
Try this one. It's basically an edge problem.
Setup:
CREATE TABLE table1 (user1 int, timestamp1 datetime, session1 int);
INSERT INTO table1 VALUES
( 1 , '2021-05-09 15:12:52' , 1 )
, ( 1 , '2021-05-09 15:18:52' , 1 ) -- within 10 min of previous timestamp
, ( 1 , '2021-05-09 15:32:52' , 2 ) -- over 10 min, new session
, ( 2 , '2021-05-09 16:00:00' , 1 ) -- different user
, ( 1 , '2021-05-09 17:00:00' , 3 ) -- new session
, ( 1 , '2021-05-09 17:02:00' , 3 )
;