Compare each first sublist with the rest and group the second sublist from the list if they pass con-CodePudding

My input data is

mylist = [[[1820, 57, 1920, 237], [2, 3, 3]], [[1830, 90, 1920, 240], [2, 3, 4]], [[1450, 0, 1667, 197], [1, 3, 6]], [[1835, 90, 1920, 240], [0, 5, 7]],[[1452, 0, 1667, 197], [9, 9, 9]]]

I have a group of rectangle data with the first sublist containing its coordinates and the second one containing rgb data. I would like to group the second sublist if they overlap. Below is the code to check whether rectangle overlap.

def isRectangleOverlap(self, R1, R2):
    if (R1[0] >= R2[2]) or (R1[2] <= R2[0]) or (R1[3] <= R2[1]) or (R1[1] >= R2[3]):
        return False
    else:
        return True

And I have the following code:

for i in range(len(mylist)):
    for j in range(i   1, len(mylist)):
        if isRectangleOverlap(mylist[i][0], mylist[j][0]):
            ....

if they pass the condition, the second sublist will be grouped together.

My expected output:

[[[2, 3, 3], [2, 3, 4], [0, 5, 7]], [[1, 3, 6], [9, 9, 9]]]

Do you have any suggestions?

UPDATE: I solved the problem with answers below but I have another question that is it possible group RGB data of retangles that cluster in a small area like this ?

CodePudding user response：

It is an interesting question, because there are a few curve balls. For one, you are not looking for overlapping rectangles, but for clusters of rectangles. Your desired result implies that. I tried to be explicit in the code below:

mylist = [[[1820, 57, 1920, 237], [2, 3, 3]], [[1830, 90, 1920, 240], [2, 3, 4]], [[1450, 0, 1667, 197], [1, 3, 6]], [[1835, 90, 1920, 240], [0, 5, 7]],[[1452, 0, 1667, 197], [9, 9, 9]]]
# R= [Ly, Lx, Ry, Rx] = (LowerLeft-Y, LowerLeft-X, UpperRight-Y, UpperRight-X)

clusters = {}
#will hold end results, clustering overlapping rectangels together by key in mylist 
#{
#   1: [1,2]
#}

def isRectangleOverlap(R1: list, R2: list):
    if (R1[0] >= R2[2]) or (R1[2] <= R2[0]) or (R1[3] <= R2[1]) or (R1[1] >= R2[3]):
        return False
    else:
        return True

def hasAlreadyOverlapped(key: int):
    for k, v in clusters.items():
        if key in v:
            return k
    return None


#itterate over every set of Rectangle,RGB combination
for i in range(len(mylist)):
    cur_rect = mylist[i][0]
    current_cluster = None

    #check if current rectangle has already overlapped with previous rectangles,
    #and if so, any new overlappings with current rectangle will be added to that cluster
    #otherwise a new cluster will be created with current index i
    current_cluster = hasAlreadyOverlapped(i) or i 1

    #if this is a new cluster, then initiate it with current rectange in it
    if not (current_cluster in clusters):
        clusters[current_cluster] = [i]

    #check remaining rectangles in mylist for overlapping, and add them to current cluster
    #unless of course, if this is the last item.
    if i < len(mylist)-1:
        for j in range(i 1, len(mylist)):
            if isRectangleOverlap(cur_rect, mylist[j][0]):
                #this rectangle could already overlapped with a previouws processed rectangle, so we 
                #need to check for that.
                if not (j in clusters[current_cluster]):
                    clusters[current_cluster].append(j)

print(clusters)
#> {1: [0, 1, 3], 3: [2, 4]}
result = []
for _, rectangles in clusters.items():
    c = []
    for i in rectangles:
        c.append(mylist[i][1])
    result.append(c)

print(result)
#> [[[2, 3, 3], [2, 3, 4], [0, 5, 7]], [[1, 3, 6], [9, 9, 9]]]

Note that there is one problem here that I haven't accounted for. Let's say, rect0 and rect8 overlap, and rect3 and rect4 overlap. The above code would make clusters. But what if both clusters overlap with rect9 (rect9 stretching both over rect8 and rect3)?

I will have another thought about it later today, fun question!

UPDATE I had another moment to think about it and I came up with a better solution, using recursion. The above solution, as explained, iterates over rectangles and sees if there is any overlap (creating clusters). However, two independent clusters cannot be merged later on when encountering a bridging rectangle. To counter this effect, the logic would be that you explore a cluster when you find one, and after you have exhausted the current cluster of overlapping rectangles, you move on to new rectangles that form new clusters.

See the code below, I have amended a bit (created a class for example for Rectangles) for readability.

class Rectangle():
    """
    Making the objects in your list a bit more readable
    """
    def __init__(self, lleft_y: int, lleft_x: int, uright_y:int, uright_x:int, rgb: list) -> None:
        self.lleft_y = lleft_y
        self.lleft_x = lleft_x
        self.uright_y = uright_y
        self.uright_x = uright_x
        self.rgb = rgb

mylist = [
    Rectangle(1820, 57, 1920, 237, [2, 3, 3]), 
    Rectangle(1830, 90, 1920, 240, [2, 3, 4]), 
    Rectangle(1450, 0, 1667, 197, [1, 3, 6]), 
    Rectangle(1835, 90, 1920, 240, [0, 5, 7]),
    Rectangle(1452, 0, 1667, 197, [9, 9, 9])]

# This list will hold all clusters of overlapping rectangles by their index
clusters: list[list[int]] = []
# This list will keep track of all already processed rectangles by their index
processed_rectangles: list[int] = []


def doRectanglesOverlap(R1: Rectangle, R2: Rectangle):
    # if (R1 above of R2) or (R1 below of R2) or (R1 left of R2) or (R1 right of R2) 
    if (R1.lleft_y >= R2.uright_y) or (R1.uright_y <= R2.lleft_y) or (R1.uright_x <= R2.lleft_x) or (R1.lleft_x >= R2.uright_x):
        return False
    else:
        return True

def find_overlapping(index: int) -> list[int]:
    # First find all rectangles directly overlapping with current rectangle
    result: list[int] = []
    for j in range(len(mylist)):
        # If this rectangle was already processed because it was part of another cluster, just skip it
        if j in processed_rectangles:
            continue
        if doRectanglesOverlap(mylist[index], mylist[j]):
            processed_rectangles.append(j)
            result.append(j)
    # Now, for each found overlapping rectangle, we want to find the overlapping rectangles as well, 
    # effectively exploring the cluster outward. Recursive functions can be handy!
    for r in result:
        result  = find_overlapping(r)
    return result 
    
# Looping over all rectangles
for i in range(len(mylist)):

    # If this rectangle was already processed because it was part of another cluster, just skip it
    if i in processed_rectangles:
        continue
    processed_rectangles.append(i)
    # If it was not part of another earlier processed cluster, it is a new cluster. Let's add it.
    current_cluster = [i]
    # Now, we are going to loop over all remaining rectangles that haven't been processed yet, and add it to the current cluster
    other_rectangles = find_overlapping(i)
    # Due to the recursion, `other_rectangles` now hold the entire rest of the cluster.
    current_cluster = current_cluster   (other_rectangles)
    clusters.append(current_cluster)
    
print(clusters)
#> [[0, 1, 3], [2, 4]]

result = []
for ix, cluster in enumerate(clusters):
    result.append([])
    for i in cluster:
        result[ix].append(mylist[i].rgb)
print(result)
#> [[[2, 3, 3], [2, 3, 4], [0, 5, 7]], [[1, 3, 6], [9, 9, 9]]]

CodePudding user response：

A simple version that sets found items to an empty list. This is not very efficient since the list to compare to doesn't get shorter.

import copy

mylist = [[[1820, 57, 1920, 237], [2, 3, 3]], [[1830, 90, 1920, 240], [2, 3, 4]], [[1450, 0, 1667, 197], [1, 3, 6]], [[1835, 90, 1920, 240], [0, 5, 7]],[[1452, 0, 1667, 197], [9, 9, 9]]]


def isRectangleOverlap(R1, R2):
    if (not R1) or (not R2):
        return False
    if (R1[0] >= R2[2]) or (R1[2] <= R2[0]) or (R1[3] <= R2[1]) or (R1[1] >= R2[3]):
        return False
    else:
        return True


mylist2 = copy.deepcopy(mylist)
grouped = []
for i in range(len(mylist)):
    grouped.append([])
    for j in range(i   1, len(mylist)):
        if isRectangleOverlap(mylist[i][0], mylist2[j][0]):
            if not grouped[i]:  # include the i'th sublist
                grouped[i].append(mylist[i][1])
            grouped[i].append(mylist[j][1])
            mylist2[j] = [[], []]

grouped = [sublst for sublst in grouped if sublst]

print(grouped)

A bit more complicated way that removes the items from the list so it doesn't check them again:

mylist2 = copy.deepcopy(mylist)
grouped = []
for i in range(len(mylist)):
    grouped.append([])
    inds_2_remove = []
    for j in range(len(mylist2)):
        
        if mylist[i] == mylist2[j]:
            inds_2_remove.append(j)
            continue

        if isRectangleOverlap(mylist[i][0], mylist2[j][0]):
            if not grouped[i]:
                grouped[i].append(mylist[i][1])
            grouped[i].append(mylist2[j][1])
            inds_2_remove.append(j)
    for ind_2_remove in reversed(inds_2_remove):
        print(ind_2_remove)
        del mylist2[ind_2_remove]

grouped = [sublst for sublst in grouped if sublst]

print(grouped)

CodePudding user response：

Is that what you want? I basically do a nested for loop to construct a set of sets and then group together and remove duplicates using set operations.

def do_rectangles_overlap(R1, R2):
    return (
        False
        if (R1[0] >= R2[2]) or (R1[2] <= R2[0]) or (R1[3] <= R2[1]) or (R1[1] >= R2[3])
        else True
    )


mylist = [
    [[1820, 57, 1920, 237], [2, 3, 3]],
    [[1830, 90, 1920, 240], [2, 3, 4]],
    [[1450, 0, 1667, 197], [1, 3, 6]],
    [[1835, 90, 1920, 240], [0, 5, 7]],
    [[1452, 0, 1667, 197], [9, 9, 9]],
]

# Construct a set of sets
total_lists = set()

for i, _ in enumerate(mylist):
    temp_list = set()
    for j, _ in enumerate(mylist):
        if do_rectangles_overlap(mylist[i][0], mylist[j][0]):
            temp_list.add(tuple(mylist[i][1]))
            temp_list.add(tuple(mylist[j][1]))
        if temp_list:
            total_lists.add(tuple(temp_list))

total_lists = {frozenset(x) for x in total_lists if len(x) != 1}

# Remove duplicate subsets from the set
m = max(map(len,total_lists)) 
m = {x for x in total_lists if len(x) == m}
deduplicated = set()
for j in total_lists:
    if not all([j.issubset(i) for i in m]):
        deduplicated.add(j)
result = m.union(deduplicated)

# Return a list
result = [tuple(x) for x in result]
print(result)

Output:

[((1, 3, 6), (9, 9, 9)),
((0, 5, 7), (2, 3, 3), (2, 3, 4))]