I have been struggling to find an appropriate solution to this for a while. I have a table named sessions.
The table has three columns:
- session_id - a unique identifier for the user's session
- timestamp - the time at which the user performed the action
- action - either login or logout
...
I track when each user logged in and when they logged out.
session_id | action | timestamp |
---|---|---|
1 | login | 2021-11-09 10:03 |
2 | login | 2021-11-09 10:15 |
1 | logout | 2021-11-09 10:48 |
3 | login | 2021-11-09 11:03 |
4 | login | 2021-11-09 11:43 |
5 | login | 2021-11-09 14:14 |
5 | logout | 2021-11-09 15:03 |
3 | logout | 2021-11-09 15:08 |
4 | logout | 2021-11-09 16:03 |
2 | logout | 2021-11-09 17:01 |
Essentially, I would like an output table that shows how many concurrent sessions there were in each hour:
timestamp | concurrent_sessions |
---|---|
2021-11-09 10:00 | 2 |
2021-11-09 11:00 | 3 |
2021-11-09 12:00 | 3 |
2021-11-09 13:00 | 3 |
2021-11-09 14:00 | 4 |
2021-11-09 15:00 | 4 |
2021-11-09 16:00 | 2 |
2021-11-09 17:00 | 1 |
Answer:
First, build one timestamp range per session from its login and logout rows, then join those ranges with a generated series of hourly timestamps, group by hour, and count:
SELECT times.t,
       count(q.session_id)  -- count matched sessions only, so an hour with none yields 0
FROM generate_series(
        TIMESTAMP '2021-11-09 10:00:00',
        TIMESTAMP '2021-11-09 17:00:00',
        INTERVAL '1 hour'
     ) AS times(t)
   LEFT JOIN (SELECT session_id,
                     tsrange(li.timestamp, lo.timestamp, '[]') AS range
              FROM sessions AS li
                 JOIN sessions AS lo USING (session_id)  -- pair each login with its logout
              WHERE li.action = 'login'
                AND lo.action = 'logout'
             ) AS q
      ON q.range @> times.t  -- the session was active at this exact point in time
GROUP BY times.t
ORDER BY times.t;
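Note that the query above counts the sessions that are active at the exact start of each hour. The expected table in the question looks like it counts every session that was active at any point during the hour (for example, 2 at 10:00, although nobody had logged in yet at 10:00 sharp). If that is what you need, here is a minimal sketch that tests for overlap with the hour bucket instead. It assumes PostgreSQL, a "timestamp" column of type timestamp without time zone, and exactly one login and one logout row per session; the temporary table and its column types are my assumptions for the sake of a self-contained example.

```sql
-- Hypothetical setup mirroring the question's sample data (column types are assumed).
CREATE TEMP TABLE sessions (
    session_id  integer   NOT NULL,
    action      text      NOT NULL,
    "timestamp" timestamp NOT NULL
);

INSERT INTO sessions (session_id, action, "timestamp") VALUES
    (1, 'login',  '2021-11-09 10:03'),
    (2, 'login',  '2021-11-09 10:15'),
    (1, 'logout', '2021-11-09 10:48'),
    (3, 'login',  '2021-11-09 11:03'),
    (4, 'login',  '2021-11-09 11:43'),
    (5, 'login',  '2021-11-09 14:14'),
    (5, 'logout', '2021-11-09 15:03'),
    (3, 'logout', '2021-11-09 15:08'),
    (4, 'logout', '2021-11-09 16:03'),
    (2, 'logout', '2021-11-09 17:01');

SELECT times.t AS "timestamp",
       count(q.session_id) AS concurrent_sessions
FROM generate_series(
        TIMESTAMP '2021-11-09 10:00:00',
        TIMESTAMP '2021-11-09 17:00:00',
        INTERVAL '1 hour'
     ) AS times(t)
   LEFT JOIN (SELECT session_id,
                     -- closed range from login to logout, as in the query above
                     tsrange(li."timestamp", lo."timestamp", '[]') AS session_range
              FROM sessions AS li
                 JOIN sessions AS lo USING (session_id)
              WHERE li.action = 'login'
                AND lo.action = 'logout'
             ) AS q
      -- && is the range "overlaps" operator; the hour bucket is [t, t + 1 hour)
      ON q.session_range && tsrange(times.t, times.t + INTERVAL '1 hour', '[)')
GROUP BY times.t
ORDER BY times.t;
```

On the sample data this should return the counts from the question's expected table (2, 3, 3, 3, 4, 4, 2, 1). If the column is actually timestamp with time zone, tstzrange and timestamptz bounds would be needed instead, and sessions without a logout row are silently dropped by the inner join.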