I define a window here as a tuple which has 2 items - the start and end (respectively) indices of an object. So the tuple (0,3)
means that the window (0,3)
has start_index = 0
and end_index=3
Sample list of 6 windows
[(0,1) , (0,3) , (4,9) , (5,9), (6,8) , (11,13)]
Desired output:
[(0, 3), (4, 9), (11, 13)]
As you can see , in the output the no. of windows have been reduced to 3 compared to 6 windows in the input -- and no index in the middle of any window is "lost".
I have given it a shot myself but am not finding the desired results as my output contains more no. of windows than desired and I have to feed that output again in my "script" to get to the final stage.. It's basically converging windows wherever possible (when overlaps btw windows happen in terms of the indices in the window)
Here's my code:
list_ = [(0,1) , (0,3) , (4,9) , (5,9), (6,8) , (11,13)]
#creating a copy of list_ as I need to remove items when I iterate..
list_copy = list(list_)
new_list = []
for i in list_:
for j in list_:
if i != j:
if (i[0] >= j[0]) and (i[1] <= j[1]):
new_list.append(j)
if i in list_copy:
list_copy.remove(i)
if j in list_copy:
list_copy.remove(j)
output = list(set(new_list list_copy))
print("list_copy:",list_copy)
print("new_list:",new_list)
print("output:",output)
Results(output
is the variable for storing my result) for my code are as below , which has the window (5,9)
unnecessarily. It gets removed if I rerun my piece of code for the output I have received from the 1st run.
list_copy: [(11, 13)]
new_list: [(0, 3), (4, 9), (4, 9), (5, 9)]
output: [(5, 9), (0, 3), (4, 9), (11, 13)]
CodePudding user response:
Algorithm
- Sort the original list by size
- result initially empty
- For each element in sorted list
- merge with element in result that has an overlap (if any)
- append element to result if no overlap
Code
def merge_ranges(ranges):
def overlap(needle, stack):
'''
Check if interval needle is overlaps with stack)
'''
needle_s, needle_e = needle
stack_s, stack_e = stack
return needle_e >= stack_s and needle_s <= stack_e
def merge(needle, stack):
needle_s, needle_e = needle
stack_s, stack_e = stack
return (min(stack_s, needle_s), max(stack_e, needle_e))
# Sort by length in non-ascending order
# Makes it so we start by keeping the longer intervals
ranges = sorted(ranges, key = lambda r: [-(r[1] - r[0]), r[0], r[1]])
result = []
for r in ranges:
for i, x in enumerate(result):
if overlap(r, x):
result[i] = merge(x, r)
break
else:
result.append(r)
return sorted(result, key = lambda t: t[0]) # return sorted by first element
Tests
print(merge_ranges([(0,1) , (0,3)]))
# Output: [(0, 3)]
print(merge_ranges([(0,1) , (0,3) , (4,9) , (5,9), (6,8) , (11,13)]))
# Output: [(0, 3), (4, 9), (11, 13)]
print(merge_ranges([(0,1) , (0,3) , (4,9) , (5,9), (6,8) , (11,13), (0, 100)]))
# Output: [(0, 100)]
print(merge_ranges([(0, 1), (0, 3), (4, 9), (5, 9), (5, 15), (6, 8), (11, 13)]))
# Output: [(0, 3), (4, 15)]
CodePudding user response:
This seems to work with OP's data:
windows = [(0, 1), (0, 3), (4, 9), (5, 9), (0, 100), (6, 8), (11, 13)]
output = []
def merge(a, b, output):
for i, (x, y) in enumerate(output):
if a >= x and b <= y:
return
if (a == x and b > y) or (a < x and b == y):
output[i] = (a, b)
return
output.append((a, b))
for a, b in windows:
merge(a, b, output)
kr = set()
for outer in output:
for k, inner in enumerate(output):
if outer != inner and inner[0] >= outer[0] and inner[1] <= outer[1]:
kr.add(k)
for i in sorted(kr)[::-1]:
output.pop(i)
print(output)
Output:
[(0, 100)]
CodePudding user response:
Continuing on my comment on @DarrylG's answer to help with the possible improvement to the final output.
In @DarrylG' Approach 1, if the original list of windows is is [(0, 1), (0, 3), (4, 9), (5, 9), (5, 15), (6, 8), (11, 13)]
, the output comes as [(5, 15), (4, 9), (0, 3)]
. As you can observe the 0th and 1st item in the output i.e. (5, 15), (4, 9)
still have overlaps.
We could modify this output to prevent any overlap among windows by merging (5, 15), (4, 9)
into (4, 15)
. Hence the final output will then be [(4, 15), (0, 3)]
Here's my code that merges the (5, 15), (4, 9)
into (4, 15)
Functions to run on Approach 1's Output..
def get_merge_windows(list_):
new_list_= []
for i in list_:
intersection_list_ = [ (i,j, bool( set(range(i[0],i[1] 1)).intersection(set(range(j[0],j[1] 1))) ) ) for j in list_]
intersection_list_ = [item for item in intersection_list_ if (item[2] == True) and (item[0] != item[1])]
if intersection_list_:
#it's neccasary to add i's start index in choices otherwise results may be wrong..
start_index_choices = [item[1][0] for item in intersection_list_] [i[0]]
start = min(start_index_choices)
#it's neccasary to add i's end index in choices otherwise results may be wrong..
end_index_choices = [item[1][1] for item in intersection_list_] [i[1]]
end = max(end_index_choices)
new_list_.append((start,end))
new_list_ = list(set(new_list_))
if new_list_ == []:
return list_
return new_list_
def run_multiple_times_merge_windows(list_):
'''
This iterative run ensures everything is converged. For example below list does not get converged in 1 go of the get_merge_windows function
example = [(5, 15), (4, 9), (0, 3),(4,16) , (1,95) , (2,100)]
get_merge_windows(example) returns [(1, 100), (0, 100)] instead of [(0,100)] and hence we require another run of get_merge_windows over the output
we got with the first run..
'''
new_list_ = []
#100 is just an arbitrary number , the iterations should stop in 3-4 loops itself..
for i in range(100):
if i>0 and (new_list_ == get_merge_windows(new_list_)):
break
elif i==0:
new_list_ = get_merge_windows(list_)
else:
new_list_ = get_merge_windows(new_list_)
return new_list_
def get_final_windows(list_,new_list_):
#add combined windows with windows windows that couldn't be merged..
new_list__2 = list(new_list_)
for i in list_:
bool_list_t = [bool( set(range(i[0],i[1] 1)).intersection(set(range(j[0],j[1] 1))) ) for j in new_list_]
if any(bool_list_t) == True:
continue
else:
new_list__2.append(i)
return new_list__2
Testing
test_list_ = [(5, 15), (4, 9), (0, 3)]
all_merged_windows = run_multiple_times_merge_windows(test_list_)
print("all_merged_windows:",all_merged_windows)
final_windows = get_final_windows(test_list_,all_merged_windows)
print("final_windows:",final_windows)
Results
all_merged_windows:[(4, 15)]
final_windows: [(4, 15), (0, 3)]
Please note that I have had to run the get_merge_windows
multiple times due to an edge case scenario in the example presented in the doc string of run_multiple_times_merge_windows
. But this is not a deal breaker as I have timed it and it runs 15k times per second for that example..