Not really an expert with SQL and im having problems figuring out how to do this one.
Got a table like this one:
ID | Message | TimeStamp | User |
---|---|---|---|
1 | Hello | 2022-08-01 10:00:00 | A |
1 | How are you? | 2022-08-01 10:00:05 | A |
1 | Hello there | 2022-08-01 10:00:10 | B |
1 | I am okay | 2022-08-01 10:00:12 | B |
1 | Good to know | 2022-08-01 10:00:15 | A |
1 | Bye | 2022-08-01 10:00:25 | B |
2 | Hello | 2022-08-01 10:02:50 | A |
2 | Hi | 2022-08-01 10:03:50 | B |
I need to calculate the time difference each time there is a response from the B user after a message from A.
Expected result would be like this
ID | Difference |
---|---|
1 | 5 |
1 | 10 |
2 | 60 |
Trying to use Lead function to obtain the next desired timestamp but im not getting the expected result Any tips or advice?
Thanks
CodePudding user response:
With mySQL 8.0:
WITH cte AS (
SELECT id, user, timestamp, ( LEAD(user) OVER (ORDER BY timestamp) ) AS to_user,
TIME_TO_SEC(TIMEDIFF(LEAD(timestamp) OVER (ORDER BY timestamp), timestamp)) AS time_diff
FROM msg_tab
)
SELECT id, time_diff
FROM cte
WHERE user='A' AND to_user IN ('B', NULL)
CodePudding user response:
Even if it is already answered - it's a nice use case for Vertica's MATCH() clause. Looking for a pattern consisting of sender = 'A'
followed by sender = 'B'
.
You get a pattern id, and then you can group by other stuff plus the pattern id to get max timestamp and min timestamp.
Also note that I renamed both "user" and "timestamp", as they are reserved words...
-- your input, don't use in final query
indata(ID,Message,ts,Usr) AS (
SELECT 1,'Hello' ,TIMESTAMP '2022-08-01 10:00:00','A'
UNION ALL SELECT 1,'How are you?',TIMESTAMP '2022-08-01 10:00:05','A'
UNION ALL SELECT 1,'Hello there' ,TIMESTAMP '2022-08-01 10:00:10','B'
UNION ALL SELECT 1,'I am okay' ,TIMESTAMP '2022-08-01 10:00:12','B'
UNION ALL SELECT 1,'Good to know',TIMESTAMP '2022-08-01 10:00:15','A'
UNION ALL SELECT 1,'Bye' ,TIMESTAMP '2022-08-01 10:00:25','B'
UNION ALL SELECT 2,'Hello' ,TIMESTAMP '2022-08-01 10:02:50','A'
UNION ALL SELECT 2,'Hi' ,TIMESTAMP '2022-08-01 10:03:50','B'
)
-- end of input, real query starts here , replace following comma with "WITH"
,
w_match_clause AS (
SELECT
*
, event_name()
, pattern_id()
, match_id()
FROM indata
MATCH (
PARTITION BY id ORDER BY ts
DEFINE
sentbya AS usr='A'
, sentbyb AS usr='B'
PATTERN
p AS (sentbya sentbyb)
)
-- ctl SELECT * FROM w_match_clause;
-- ctl ID | Message | ts | Usr | event_name | pattern_id | match_id
-- ctl ---- -------------- --------------------- ----- ------------ ------------ ----------
-- ctl 1 | How are you? | 2022-08-01 10:00:05 | A | sentbya | 1 | 1
-- ctl 1 | Hello there | 2022-08-01 10:00:10 | B | sentbyb | 1 | 2
-- ctl 1 | Good to know | 2022-08-01 10:00:15 | A | sentbya | 2 | 1
-- ctl 1 | Bye | 2022-08-01 10:00:25 | B | sentbyb | 2 | 2
-- ctl 2 | Hello | 2022-08-01 10:02:50 | A | sentbya | 1 | 1
-- ctl 2 | Hi | 2022-08-01 10:03:50 | B | sentbyb | 1 | 2
)
SELECT
id
, MAX(ts) - MIN(ts) AS difference
FROM w_match_clause
GROUP BY
id
, pattern_id
ORDER BY
id;
-- out id | difference
-- out ---- ------------
-- out 1 | 00:00:05
-- out 1 | 00:00:10
-- out 2 | 00:01