Home > Software design >  Find highest-ranked pieces of intersecting rows in SQL
Find highest-ranked pieces of intersecting rows in SQL

Time:12-13

This probably falls into that "I could google it if only I knew the proper name for it" category, but here goes.

I have a table of intersecting activity timeframes, and a ranking for those activities. I need to get remaining timeframes for each row after each higher-ranked intersection has been removed.

I've tried joining each row with each intersecting row, but adjusting the start and end times based on each individual intersection doesn't account for higher-level intersections. Here's what I got to before I decided I'm probably approaching this wrong conceptually.

select
    Activity
    ,rank
    ,case
        when b.startTime <= a.startTime then b.endTime 
        when b.endtime >= a.endTime then b.starttime
        else a.startTime
        end as starttimeAdj
    ,case
        when b.starttime <= b.startTime then a.endtime
        when b.endtime >= a.endTime then b.starttime
        else a.endTime
        end as endtimeAdj
from myTable as a
left join myTable as b
    on b.startTime< a.endTime
    and b.endTime> a.startTime
    and b.Activity!= a.Activity
    and b.Rank<a.rank
where
    b.segID is null or
    not (b.starttime <= a.startTime and b.endtime >= a.endTime)
order by a.starttime,a.rank,b.starttime,b.rank

Here's an example. This is the starting data:

Activity    rank    StartTime   EndTime
Meeting     1       8:00        9:00
Startup     2       8:00        8:10
Shift       4       8:00        19:00
Break1      3       10:15       10:30
Break2      3       17:00       17:15

This is what I'm trying to get to:

Activity    rank    StartTime   EndTime
Meeting     1       8:00        9:00
Shift       4       9:00        10:15
Break1      3       10:15       10:30
Shift       4       10:30       17:00
Break2      3       17:00       17:15
Shift       4       17:15       19:00

The Startup activity is gone because it is fully encompassed by a higher-ranked event. The Shift activity, as the lowest rank, has been fragmented by each thing that intersects it, with only the non-intersecting periods remaining.

Visual representation by rank:

<--Meeting-->
<Startup>           
                      <Break1>                                            <Break2>
<-----------------------------------------------Shift------------------------------------>

Becomes

<--Meeting--><--Shift-><Break><-----------------Shift--------------------><Break><-Shift->

CodePudding user response:

I'll bite. Here's something to try. I treated the leading edge and trailing edge of each activity slightly differently, since we want to include the current activity when handling the leading edge results, and exclude the current activity when handling the trailing edge results.

CTE term Description
edge1 Leading edge start time rank of overlapping rows
edge2 Trailing edge end time rank of overlapping rows
xall Just UNION the leading and trailing edge times of the best ranked rows
xprune Determine which adjacent rows (having the same segID) can be pruned
final Reduce to the final set of rows while calculating the new end times

The last query expression selects the columns of interest and removes the rows which indicate gaps in activity.

Fiddle for SQL Server

The SQL:

WITH edge1 AS (
        SELECT t2.Activity
             , t2.rank
             , t1.StartTime
             , t2.segID
             , ROW_NUMBER() OVER (PARTITION BY t1.StartTime ORDER BY t2.rank) AS rnk
          FROM      segments AS t1
          LEFT JOIN segments AS t2
            ON t1.StartTime >= t2.StartTime
           AND t1.StartTime <  t2.EndTime
     )
   , edge2 AS (
        SELECT t2.Activity
             , t2.rank
             , t1.EndTime
             , t2.segID
             , ROW_NUMBER() OVER (PARTITION BY t1.EndTime ORDER BY t2.rank) AS rnk
          FROM      segments AS t1
          LEFT JOIN segments AS t2
            ON t1.EndTime > t2.StartTime
           AND t1.EndTime < t2.EndTime
           AND t1.segID <> t2.segID
     )
   , xall AS (
        SELECT * FROM edge1 WHERE rnk = 1
         UNION
        SELECT * FROM edge2 WHERE rnk = 1
     )
   , xprune AS (
        SELECT *
             , CASE WHEN LAG(segID) OVER (ORDER BY StartTime) = segID THEN 1 ELSE 0 END AS prune
          FROM xall
     )
   , final AS (
        SELECT *
             , LEAD(StartTime) OVER (ORDER BY StartTime) AS EndTime
          FROM xprune
         WHERE prune = 0
     )
SELECT Activity, rank, StartTime, EndTime
  FROM final
 WHERE Activity IS NOT NULL
 ORDER BY StartTime
;

The result:

 ---------- ------ ----------- --------- 
| Activity | rank | StartTime | EndTime |
 ---------- ------ ----------- --------- 
| Meeting  |    1 |       800 |     900 |
| Shift    |    4 |       900 |    1015 |
| Break1   |    3 |      1015 |    1030 |
| Shift    |    4 |      1030 |    1700 |
| Break2   |    3 |      1700 |    1715 |
| Shift    |    4 |      1715 |    1900 |
 ---------- ------ ----------- --------- 
  • Related