Pairing list items in a list-CodePudding

I am trying to pair elements of a list based on a condition.

If two element have a common i will merge them and do this until no elements can be merged. Currently, my problem is looping through same elements and getting same merged result from different items. I have to check if group has been added before.But as my array is empty in the beginning i could not check if element already in it with axis 1. I tried recursive : Also i am discarding if a group has length less than three.

pairs = [[1, 3], [1, 8], [2, 1], [2, 3], [3, 1], [3, 8], [4, 11], [4, 15], [7, 13], [9, 12], [9, 13], [10, 1], [10, 18], [10, 20], ...]


def groupG(pairs):
    groups = []
    if len(pairs) > 1:
        for i,pair in enumerate(pairs):
            try:
                if (any(point in pairs[i 1] for point in pair)):
                    group = np.concatenate(( pair,pairs[i 1]))
                    group = np.unique(group)
                    groups.append(group)
            except IndexError:
                continue
            
    if len(groups) == 0 :
        groupsFiltered = np.array([row for row in pairs if len(row)>=3])
        return groupsFiltered
    else: 
        return groupG(groups)

expected result is :

[[1,2,3,8,10,18,20],[4,11,15],[7,9,12,13]...]

Is there a way to group these pairs with while,do while or recursive?

CodePudding user response：

Use networkx's connected_components:

import networkx as nx

pairs = [[1, 3], [1, 8], [2, 1], [2, 3], [3, 1], [3, 8], [4, 11], [4, 15],
         [7, 13], [9, 12], [9, 13], [10, 1], [10, 18], [10, 20]]

out = list(nx.connected_components(nx.from_edgelist(pairs)))

output:

[{1, 2, 3, 8, 10, 18, 20}, {4, 11, 15}, {7, 9, 12, 13}]

CodePudding user response：

This could be done with sets and the reduce function.

from functools import reduce

def merge(groups, pair):
    for i, group in enumerate(groups):
        if group.intersection(pair):
            groups[i] = group.union(pair)
            break
    else:
        groups.append(set(pair))
    return groups

assert(merge([], [1, 3]) == [{1, 3}])
assert(merge([{1, 3}], [1, 8]) == [{1, 3, 8}])
assert(merge([{1, 3}], [4, 11]) == [{1, 3}, {4, 11}])

pairs = [[1, 3], [1, 8], [2, 1], [2, 3], [3, 1], [3, 8], [4, 11], 
         [4, 15], [7, 13], [9, 12], [9, 13], [10, 1], [10, 18], 
         [10, 20]]
groups = reduce(merge, pairs, [])
print(groups)

Output:

[{1, 2, 3, 18, 20, 8, 10}, {11, 4, 15}, {9, 13, 7}, {9, 12}]

UPDATE:

As pointed out in the comments, this solution is not sufficient because there may be residual groups that still need merging.

You can add a recursive loop at the end:

groups = reduce(merge, pairs, [])
while True:
    n = len(groups)
    groups = reduce(merge, groups, [])
    if len(groups) == n:
        break
print(groups)

Output:

[{1, 2, 3, 8, 10, 18, 20}, {4, 11, 15}, {7, 9, 12, 13}]

Another way is to do the recursion within the merge function:

def merge(groups, pair):
    for i, group in enumerate(groups):
        if group.intersection(pair):
            group = group.union(pair)
            groups.pop(i)
            # Try to merge the modified group with
            # the others (recursive)
            groups = reduce(merge, [group], groups)
            break
    else:
        groups.append(set(pair))
    return groups

groups = reduce(merge, pairs, [])

Not sure if these are efficient!

CodePudding user response：

Well, you can do this :

def group_pairs(available_pairs):
    # check if there was any new merge to break the loop otherwise
    is_modified = True

    # a while loop on the 'is_modified' condition
    while is_modified:
        # init the new available pairs, and the pair to start the merging with
        new_pairs = available_pairs.copy()
        merged_pairs = available_pairs[0]

        #merge if eny similarity is noticed
        for i in range(len(available_pairs)):
            if any(el in available_pairs[i] for el in merged_pairs):
                merged_pairs  = available_pairs[i]
                new_pairs.remove(available_pairs[i])

        # add the merged data
        new_pairs.append(list(set(merged_pairs)))

        # update the while checker
        is_modified = len(new_pairs) != len(available_pairs)

        # update the available pairs
        available_pairs = new_pairs.copy()
    
    return available_pairs

# a little test
pairs = [[1, 3], [1, 8], [2, 1], [2, 3], [3, 1], [3, 8], [4, 11], [4, 15],
         [7, 13], [9, 12], [9, 13], [10, 1], [10, 18], [10, 20]]

print(group_pairs(pairs))

output:

[[11, 4, 15], [9, 12, 13, 7], [1, 2, 3, 8, 10, 18, 20]]