How to find and combine/group array buddy value together into several array?-CodePudding

I have an big array named with whoHasMe as below:

whoHasMe:  [[0], [1], [0, 2, 3], [1, 2, 3, 4, 5], [4], [5], [6, 7, 8], [7], [6, 8]]

As we can notice, these

whoHasMe[0] has relationship with whoHasMe[2]

whoHasMe[1] has relationship with whoHasMe[3]

whoHasMe[2] has relationship with whoHasMe[3]

whoHasMe[4] has relationship with whoHasMe[3]

whoHasMe[5] has relationship with whoHasMe[3]

whoHasMe[6] has relationship with whoHasMe[7]

whoHasMe[8] has relationship with whoHasMe[7]

How can we group it to become another new big array?

finalBigArray = [[0,1,2,3,4,5], [6,7,8]]

Note that: The value of whoHasMe Array might change. Seeking for help on how to find it's relationship and build a final big array.

Thank you and have a nice day.

CodePudding user response：

To provide a solution for reference, the logic is broken down into a few simple steps

remove nested lists with only one value that are contained in other nested lists, e.g. [0], [1]
combine the contents of the previous filtering step with the original nested list containing multiple values to form a new list
sort the list by least value, so that the sorting process is sequential and you don't have to start over each time
use set to determine if two lists have at least one identical element, and update the length of the nested list if they do, otherwise append directly to the result

def merge(intervals):
    # 1 and 2. Clear single element lists contained within other content and assemble new nested lists
    single_item = [item for item in intervals if len(item) == 1]
    multiply_item = [item for item in intervals if len(item) > 1]
    intervals = list(filter(lambda x: all(x[0] not in item for item in multiply_item), single_item))   multiply_item

    # 3. Sort by smallest value
    intervals = sorted(intervals, key=lambda x: min(x))

    # 4. Traversing to determine relevance
    res = [intervals[0]]       # Default is the first element
    for i in range(1, len(intervals)):  # Starting from index 1
        interval = intervals[i]
        if len(set(res[-1]) & set(interval)):   # Because it was previously ordered, it is only necessary to determine the last value of res
            res[-1] = sorted(set(res[-1]   interval))
        else:
            res.append(interval)
    return res

OUTPUT:

print(merge([[0], [1], [0, 2, 3], [1, 2, 3, 4, 5], [4], [5], [6, 7, 8], [7], [6, 8]]))
print(merge([[0, 1], [2, 3], [2, 3, 4, 5], [2, 4], [5], [6, 7, 8], [7], [6, 8]]))
print(merge([[5, 9], [1, 10], [10, 2, 5], [1, 2, 3, 4, 5], [4, 3, 1], [5], [6, 7, 8], [7], [6, 8], [21, 43]]))

# [[0, 1, 2, 3, 4, 5], [6, 7, 8]]
# [[0, 1], [2, 3, 4, 5], [6, 7, 8]]
# [[1, 2, 3, 4, 5, 9, 10], [6, 7, 8], [21, 43]]

Feel free to correct any mistakes and learn from each other

CodePudding user response：

The method I can think of is to use disjoint set. Here is a simple implementation:

class DisjointSet:

    def __init__(self, n):
        self.root = list(range(n))

    def __getitem__(self, k):    # find root
        return k if self.root[k] == k else self[self.root[k]]

    def union(self, a, b):
        x, y = self[a], self[b]
        if x != y:
            self.root[y] = x

First, convert each list in the given list into a set:

lst = [[0], [1], [0, 2, 3], [1, 2, 3, 4, 5], [4], [5], [6, 7, 8], [7], [6, 8]]
lst = list(map(set, lst))

Then judge whether there is a relationship between each pair of combinations. If so, union their indices:

length = len(lst)
ds = DisjointSet(length)
for i in range(length):
    for j in range(i   1, length):
        if lst[i] & lst[j]:
            ds.union(i, j)

The two-layer loop in previous step can be simplified through itertools.combinations:
from itertools import combinations
for (i, s1), (j, s2) in combinations(enumerate(lst), 2):
    if s1 & s2:
        ds.union(i, j)

With the help of defaultdict, we union sets in the same set:

from collections import defaultdict
d = defaultdict(set)
for i in range(length):
    d[ds[i]] |= lst[i]

Finally, convert the dictionary values to lists:

print(list(map(sorted, d.values())))

Output:

[[0, 1, 2, 3, 4, 5], [6, 7, 8]]

Update:

I have slightly modified the disjoint set, converted recursion into a loop, and enabled it to support iteration through the __getitem__ method and __len__ method. The final code is as follows:

from collections import defaultdict
from itertools import combinations


class DisjointSet:

    def __init__(self, n):
        self.root = [None] * n

    def __getitem__(self, k):
        root = self.root
        while (r := root[k]) is not None:
            k = r
        return k

    def __len__(self):
        return len(self.root)

    def union(self, a, b):
        x, y = self[a], self[b]
        if x != y:
            self.root[y] = x


def my_merge(lists):
    lists = list(map(set, lists))
    ds = DisjointSet(len(lists))
    for (i, s1), (j, s2) in combinations(enumerate(lists), 2):
        if s1 & s2:
            ds.union(i, j)

    d = defaultdict(set)
    for r, s in zip(ds, lists):
        d[r] |= s

    return list(map(sorted, d.values()))

Some test:

import random

for _ in range(3):
    lst = [sorted(random.randint(0, 50) for _ in range(random.randint(1, 6))) for _ in range(5)]
    print(f'lists:  {lst}')
    print(f'result: {my_merge(lst)}')

Output:

lists:  [[21, 26, 42, 44, 47], [4, 27, 40, 49], [5, 20, 39, 50], [14, 40, 47], [6]]
result: [[4, 14, 21, 26, 27, 40, 42, 44, 47, 49], [5, 20, 39, 50], [6]]
lists:  [[14, 26], [22], [4], [18, 22, 33, 37], [4, 5, 13, 19, 38, 46]]
result: [[14, 26], [18, 22, 33, 37], [4, 5, 13, 19, 38, 46]]
lists:  [[9, 9, 21, 31, 41], [7, 9, 28, 35, 43], [13, 23, 37, 42, 45], [4, 14, 21, 25], [0, 12, 14, 39, 50]]
result: [[0, 4, 7, 9, 12, 14, 21, 25, 28, 31, 35, 39, 41, 43, 50], [13, 23, 37, 42, 45]]