I am working with a list of lists where some items have overlapping values.
lst = [[7, 11, 'Feature01'],
       [2, 6, 'Feature02'],
       [31, 59, 'Feature03'],
       [31, 41, 'Feature04'],
       [20, 40, 'Feature05'],
       [25, 30, 'Feature06']]
For example, in the items below, Feature04 lies inside the coordinates of Feature03:
[31, 59, 'Feature03'], [31, 41, 'Feature04']
Similarly, in the items below, Feature06 lies inside the coordinates of Feature05:
[20, 40, 'Feature05'], [25, 30, 'Feature06']
I want to retain only one item in each such overlapping case and update the original list of lists so that it holds only the non-overlapping items.
I found a similar question but could not get it to work.
Thanks in advance
CodePudding user response:
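Sort the list by the starting point of each interval, breaking ties by end point in descending order. Then walk through the sorted list and append an interval to the result only if it isn't contained in the most recently appended interval. Because of the sort order, any interval that could contain the current one has already been seen, and the kept intervals have strictly increasing end points, so checking against the last kept interval is enough: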
lst.sort(key=lambda x: (x[0], -x[1]))
result = [lst[0]]
for entry in lst[1:]:
    current_start, current_end, _ = entry
    last_start, last_end, _ = result[-1]
    if not (last_start <= current_start <= current_end <= last_end):
        result.append(entry)
print(result)
This outputs:
[[2, 6, 'Feature02'], [7, 11, 'Feature01'], [20, 40, 'Feature05'], [31, 59, 'Feature03']]
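If you want to reuse this, the same idea could be wrapped in a function. The following is only a sketch: the names drop_contained and features are made up for illustration, and it assumes every item has the form [start, end, label] with start <= end. Unlike the snippet above, it sorts a copy, so the original list is left unchanged, and it also accepts an empty list.

```python
def drop_contained(intervals):
    """Return the intervals that are not contained in another interval.

    Hypothetical helper; assumes each item is [start, end, label]
    with start <= end.
    """
    # Sort a copy: start ascending, end descending, so that a containing
    # interval always comes before the intervals it contains.
    ordered = sorted(intervals, key=lambda x: (x[0], -x[1]))
    result = []
    for entry in ordered:
        start, end = entry[0], entry[1]
        # Kept intervals have strictly increasing end points, so only the
        # most recently kept one needs to be checked for containment.
        if result and result[-1][0] <= start and end <= result[-1][1]:
            continue
        result.append(entry)
    return result


features = [[7, 11, 'Feature01'],
            [2, 6, 'Feature02'],
            [31, 59, 'Feature03'],
            [31, 41, 'Feature04'],
            [20, 40, 'Feature05'],
            [25, 30, 'Feature06']]

kept = drop_contained(features)
print(kept)
```

Note that this removes only intervals that lie fully inside another one. Intervals that overlap partially, such as Feature05 (20 to 40) and Feature03 (31 to 59), are both kept. To update the original list in place, you could assign the result back with features[:] = kept.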