Given this table, I'm trying to select all of ids that have an overlapping start_time and end_time grouped by ID. In this case, the table has multiple Ids that may or may not have multiple entries. (In this case, id's 1 and 2 don't have multiple rows, whereas 0 does.) Is there a way to retrieve all Ids that have an overlapping start and end time in Sql?
Example: Given this table, design some query to yield the subsequent table.
id | start time | end time |
---|---|---|
0 | 2022-06-10 12:44:55 | 2022-06-10 12:46:55 |
1 | 2022-06-10 12:47:55 | 2022-06-10 12:48:55 |
2 | 2022-06-10 12:49:00 | 2022-06-10 12:50:00 |
0 | 2022-06-10 12:45:55 | 2022-06-10 12:48:55 |
Id's with timestamp overlaps |
---|
0 |
CodePudding user response:
With SRC AS (
SELECT 0 id, '2022-06-10 12:44:55' start_time, '2022-06-10 12:46:55' end_time FROM DUAL UNION ALL
SELECT 1, '2022-06-10 12:47:55', '2022-06-10 12:48:55' FROM DUAL UNION ALL
SELECT 2, '2022-06-10 12:49:00', '2022-06-10 12:50:00' FROM DUAL UNION ALL
SELECT 0, '2022-06-10 12:45:55', '2022-06-10 12:48:55' FROM DUAL )
SELECT distinct A.id
FROM SRC A
CROSS JOIN SRC B
on (B.start_time between A.start_time and A.end_time
OR B.end_time between A.start_time and A.end_time)
AND (A.start_time <> B.start_Time
OR B.end_time <> B.end_time)
AND A.id = B.id
Giving us:
-----
| Aid |
-----
| 0 |
-----
CodePudding user response:
There are multiple ways of doing this, but the basic idea would be something like:
WITH mytable AS
(
SELECT row_number() OVER () rid, id, starttime, endtime
FROM myrealtable
)
SELECT distinct id
FROM mytable mt1
WHERE exists
(SELECT *
FROM mytable mt2
WHERE mt2.id = mt1.id
AND mt2.starttime >= mt1.starttime and mt2.starttime < mt1.endtime)
OR exists
(SELECT *
FROM mytable mt2
WHERE mt2.id = mt1.id
AND mt2.endtime > mt1.starttime and mt2.endtime <= mt1.endtime)
Or:
WITH mytable AS
(
SELECT row_number() OVER () rid, id, starttime, endtime
FROM myrealtable
)
SELECT distinct mt1.id
FROM mytable mt1 INNER JOIN mytable mt2
ON mt2.id = mt1.id and mt2.rid <> mt1.rid
WHERE (mt2.starttime >= mt1.starttime and mt2.starttime < mt1.endtime)
OR (mt2.endtime > mt1.starttime and mt2.endtime <= mt1.endtime)
The CTE is just to get a unique id (rid
) from row_number
to ensure we don't match the row with itself. You might have an alternate way of doing this (e.g. maybe you know all the columns can't be the same).
If you want to handle the times inclusively, you could use BETWEEN
to make the conditions a little simpler, but I assume 10:00-11:00 and 11:00-11:30 should not be considered overlaps.
I listed two different ways since I think the first method (with EXISTS
) is very human-readable and easy to understand what it's doing. The second method (with a JOIN
) is probably better (shorter and might end up optimized better) and should make sense if you understand the first example.