I have a table with users, risk level, and timestamp of that risk. It looks like this:
user | risk | timestamp |
---|---|---|
Jim | high | 2022-01-01 |
Jim | low | 2022-01-02 |
Bob | low | 2022-01-03 |
Bob | high | 2022-01-05 |
Bob | low | 2022-01-07 |
Kev | high | 2022-01-08 |
Kev | low | 2022-01-10 |
Kev | high | 2022-01-15 |
Kev | low | 2022-01-23 |
I want to find how long it takes for a user to go from high risk to low, so it would look something like this:
user | high_timestamp | low_timestamp | duration |
---|---|---|---|
Jim | 2022-01-01 | 2022-01-02 | 1 day |
Bob | 2022-01-05 | 2022-01-07 | 2 day |
Kev | 2022-01-08 | 2022-01-10 | 2 day |
Kev | 2022-01-15 | 2022-01-23 | 8 day |
CodePudding user response:
You could accomplish this with window functions and FILTER, but the way with the least code would be something like:
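SELECT
  "user",
  -- most recent 'high' row for the same user before this 'low' row;
  -- NULL when the user has no earlier 'high' row
  (SELECT highs.timestamp
   FROM t highs
   WHERE highs."user" = t."user" AND highs.risk = 'high' AND highs.timestamp < t.timestamp
   ORDER BY highs.timestamp DESC LIMIT 1) AS high_timestamp,
  timestamp AS low_timestamp
FROM t
WHERE risk = 'low';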
If performance is a concern, though, I would test the different options against each other.
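For comparison, the window-function route with FILTER mentioned above might look roughly like the sketch below. It assumes PostgreSQL, the same table name t, and that timestamp is a date column (so the subtraction yields a whole number of days; with a timestamp column it would yield an interval instead). The subquery alias s is only illustrative.

```sql
-- Sketch only: for each row, find the earliest 'low' timestamp at or after it
-- for the same user, then keep the 'high' rows that have such a 'low' row.
SELECT
  "user",
  high_timestamp,
  low_timestamp,
  low_timestamp - high_timestamp AS duration  -- integer days if the column is a date
FROM (
  SELECT
    "user",
    risk,
    timestamp AS high_timestamp,
    MIN(timestamp) FILTER (WHERE risk = 'low') OVER (
      PARTITION BY "user"
      ORDER BY timestamp
      ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
    ) AS low_timestamp
  FROM t
) s
WHERE risk = 'high'
  AND low_timestamp IS NOT NULL
ORDER BY high_timestamp;
```

Unlike the correlated subquery, this starts from the high rows, so a low row with no earlier high (Bob on 2022-01-03 in the sample data) produces no output row.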
CodePudding user response:
You can use LEFT JOIN LATERAL, which also keeps user rows with a high value in the risk column that are not followed by a low value at all. It would look like this:
SELECT
  high."user",
  high.timestamp AS high_timestamp,
  low.timestamp AS low_timestamp,
  CAST(low.timestamp - high.timestamp AS TEXT) || ' day' AS duration
FROM user_risk high
LEFT JOIN LATERAL (
  SELECT
    "user",
    timestamp
  FROM user_risk
  WHERE risk = 'low' AND "user" = high."user" AND timestamp >= high.timestamp
  ORDER BY timestamp
  LIMIT 1
) low ON TRUE
WHERE risk = 'high'
Please check the working demo at https://www.db-fiddle.com/f/gRhrkRrvLfBvYGXFAVU7Vc/1.
I've added one more row at the end, ('Kev', 'high', '2022-01-25'), which is not followed by a low risk value, to demonstrate this case.
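If high rows without a following low row should be left out instead, as in the expected output in the question, one possible variation is to swap the LEFT JOIN LATERAL for a CROSS JOIN LATERAL. This is a sketch under the same assumptions as the query above (PostgreSQL, table user_risk); the duration is left as a plain subtraction here, so its type depends on the column type.

```sql
-- Sketch only: an inner lateral join drops 'high' rows
-- for which the subquery finds no later 'low' row.
SELECT
  high."user",
  high.timestamp AS high_timestamp,
  low.timestamp AS low_timestamp,
  low.timestamp - high.timestamp AS duration
FROM user_risk high
CROSS JOIN LATERAL (
  SELECT timestamp
  FROM user_risk
  WHERE risk = 'low' AND "user" = high."user" AND timestamp >= high.timestamp
  ORDER BY timestamp
  LIMIT 1
) low
WHERE high.risk = 'high';
```

With the extra ('Kev', 'high', '2022-01-25') row from the demo, this version should return only the four matched pairs.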