I have a table consisting of actions that are logged when users are viewing files. When a user views a file, this can result in multiple actions over time. As multiple users can be active at the same time, their actions can become intertwined.
I am looking to create an additional column in my query that tells me when a user started viewing this file (let's call this StartId).
When a user stops viewing a file to view some other file, and then goes back to viewing the first file, this should be regarded as a new viewing session.
This table illustrates my problem and my desired result:
Id | User | File | StartId (desired result) |
---|---|---|---|
1 | A | X | 1 |
2 | A | X | 1 |
3 | B | Y | 3 |
4 | A | X | 1 |
5 | B | Y | 3 |
6 | A | Y | 6 |
7 | A | X | 7 |
The closest I have come is with this line:
StartId = FIRST_VALUE(Id) OVER (PARTITION BY User, File ORDER BY Id)
However, this has the following result for the last action in the example:
Id | User | File | StartId |
---|---|---|---|
7 | A | X | 1 |
Can someone point me in the right direction with this?
CodePudding user response:
you can use LAG
to see if the previous file for the same user was the same as the current file for that user and so whether we are in a new session or not and then use that result accordingly.
WITH T AS
(
SELECT *,
CASE WHEN "File" = LAG("File") OVER (PARTITION BY "User" ORDER BY "Id") THEN NULL ELSE "Id" END AS NewSessionFlag
FROM YourTable
)
SELECT *,
MAX(NewSessionFlag) OVER (PARTITION BY "User" ORDER BY "Id" ROWS UNBOUNDED PRECEDING)
FROM T
ORDER BY "Id"