How to converge n windows(of start and end indices) to the least possible no. of windows such that a-CodePudding

I define a window here as a tuple which has 2 items - the start and end (respectively) indices of an object. So the tuple (0,3) means that the window (0,3) has start_index = 0 and end_index=3

Sample list of 6 windows

[(0,1) , (0,3) , (4,9) , (5,9), (6,8) , (11,13)]

Desired output:

[(0, 3), (4, 9), (11, 13)]

As you can see , in the output the no. of windows have been reduced to 3 compared to 6 windows in the input -- and no index in the middle of any window is "lost".

I have given it a shot myself but am not finding the desired results as my output contains more no. of windows than desired and I have to feed that output again in my "script" to get to the final stage.. It's basically converging windows wherever possible (when overlaps btw windows happen in terms of the indices in the window)

Here's my code:

list_ = [(0,1) , (0,3) , (4,9) , (5,9), (6,8) , (11,13)]

#creating a copy of list_ as I need to remove items when I iterate..
list_copy = list(list_)

new_list = []

for i in list_:
    for j in list_:
        if i != j:
            if (i[0] >= j[0]) and (i[1] <= j[1]):
                
                new_list.append(j)
                
                if i in list_copy:
                    list_copy.remove(i)
                if j in list_copy:                    
                    list_copy.remove(j)

output  = list(set(new_list   list_copy))

print("list_copy:",list_copy)
print("new_list:",new_list)    
print("output:",output)

Results(output is the variable for storing my result) for my code are as below , which has the window (5,9) unnecessarily. It gets removed if I rerun my piece of code for the output I have received from the 1st run.

list_copy: [(11, 13)]
new_list: [(0, 3), (4, 9), (4, 9), (5, 9)]
output: [(5, 9), (0, 3), (4, 9), (11, 13)]

CodePudding user response：

Algorithm

Sort the original list by size
result initially empty
For each element in sorted list
- merge with element in result that has an overlap (if any)
- append element to result if no overlap

Code

def merge_ranges(ranges):
    def overlap(needle, stack):
        '''
            Check if interval needle is overlaps with stack)
        '''
        needle_s, needle_e = needle
        stack_s, stack_e = stack

        return needle_e >= stack_s and needle_s <= stack_e
    
    def merge(needle, stack):
        needle_s, needle_e = needle
        stack_s, stack_e = stack
        
        return (min(stack_s, needle_s), max(stack_e, needle_e))
    
    # Sort by length in non-ascending order
    # Makes it so we start by keeping the longer intervals
    ranges = sorted(ranges, key = lambda r: [-(r[1] - r[0]), r[0], r[1]])
   
    result = []
    for r in ranges:
        for i, x in enumerate(result):
            if overlap(r, x):
                result[i] = merge(x, r)
                break
        else:
            result.append(r)
            
    return sorted(result, key = lambda t: t[0])  # return sorted by first element

Tests

print(merge_ranges([(0,1) , (0,3)]))
# Output: [(0, 3)]

print(merge_ranges([(0,1) , (0,3) , (4,9) , (5,9), (6,8) , (11,13)]))
# Output: [(0, 3), (4, 9), (11, 13)]

print(merge_ranges([(0,1) , (0,3) , (4,9) , (5,9), (6,8) , (11,13), (0, 100)]))
# Output: [(0, 100)]

print(merge_ranges([(0, 1), (0, 3), (4, 9), (5, 9), (5, 15), (6, 8), (11, 13)]))
# Output: [(0, 3), (4, 15)]

CodePudding user response：

This seems to work with OP's data:

windows = [(0, 1), (0, 3), (4, 9), (5, 9), (0, 100), (6, 8), (11, 13)]
output = []


def merge(a, b, output):
    for i, (x, y) in enumerate(output):
        if a >= x and b <= y:
            return
        if (a == x and b > y) or (a < x and b == y):
            output[i] = (a, b)
            return
    output.append((a, b))

for a, b in windows:
    merge(a, b, output)

kr = set()

for outer in output:
    for k, inner in enumerate(output):
        if outer != inner and inner[0] >= outer[0] and inner[1] <= outer[1]:
            kr.add(k)

for i in sorted(kr)[::-1]:
    output.pop(i)

print(output)

Output:

[(0, 100)]

CodePudding user response：

Continuing on my comment on @DarrylG's answer to help with the possible improvement to the final output.

In @DarrylG' Approach 1, if the original list of windows is is [(0, 1), (0, 3), (4, 9), (5, 9), (5, 15), (6, 8), (11, 13)] , the output comes as [(5, 15), (4, 9), (0, 3)]. As you can observe the 0th and 1st item in the output i.e. (5, 15), (4, 9) still have overlaps.

We could modify this output to prevent any overlap among windows by merging (5, 15), (4, 9) into (4, 15) . Hence the final output will then be [(4, 15), (0, 3)]

Here's my code that merges the (5, 15), (4, 9) into (4, 15)

Functions to run on Approach 1's Output..

def get_merge_windows(list_):

    new_list_= []

    for i in list_:

        intersection_list_ = [ (i,j, bool( set(range(i[0],i[1] 1)).intersection(set(range(j[0],j[1] 1))) ) ) for j in list_]
        intersection_list_ = [item for item in intersection_list_ if (item[2] == True) and (item[0] != item[1])]

        if intersection_list_:

            #it's neccasary to add i's start index in choices otherwise results may be wrong..            
            start_index_choices = [item[1][0] for item in intersection_list_]   [i[0]]
            start = min(start_index_choices)
            
            #it's neccasary to add i's end index in choices otherwise results may be wrong..       
            end_index_choices = [item[1][1] for item in intersection_list_]   [i[1]]
            end = max(end_index_choices)

            new_list_.append((start,end))
            new_list_ = list(set(new_list_))

    if new_list_ == []:
        return list_

    return new_list_

def run_multiple_times_merge_windows(list_):
    '''
        This iterative run ensures everything is converged. For example below list does not get converged in 1 go of the get_merge_windows function
        example =  [(5, 15), (4, 9), (0, 3),(4,16) , (1,95) , (2,100)]
        get_merge_windows(example) returns [(1, 100), (0, 100)] instead of [(0,100)] and hence we require another run of get_merge_windows over the output
        we got with the first run..
    '''
    new_list_ = []

    #100 is just an arbitrary number , the iterations should stop in 3-4 loops itself..
    for i in range(100):

        
        if i>0 and (new_list_ == get_merge_windows(new_list_)):
            break

        elif i==0:
            new_list_ = get_merge_windows(list_)

        else:
            new_list_ = get_merge_windows(new_list_)

    return new_list_

def get_final_windows(list_,new_list_):
    #add combined windows with windows windows that couldn't be merged..
    
    new_list__2 = list(new_list_)

    for i in list_:

        bool_list_t = [bool( set(range(i[0],i[1] 1)).intersection(set(range(j[0],j[1] 1))) ) for j in new_list_]

        
        if any(bool_list_t) == True:
            continue

        else:
            new_list__2.append(i)

    return new_list__2

Testing

test_list_ = [(5, 15), (4, 9), (0, 3)]
all_merged_windows = run_multiple_times_merge_windows(test_list_)
print("all_merged_windows:",all_merged_windows)
final_windows = get_final_windows(test_list_,all_merged_windows)
print("final_windows:",final_windows)

Results

all_merged_windows:[(4, 15)]
final_windows: [(4, 15), (0, 3)]

Please note that I have had to run the get_merge_windows multiple times due to an edge case scenario in the example presented in the doc string of run_multiple_times_merge_windows. But this is not a deal breaker as I have timed it and it runs 15k times per second for that example..