Home > Back-end >  How to group lists according to at least one element in common in Python?
How to group lists according to at least one element in common in Python?

Time:11-13

I have a list of lists and I want to group them according to at least one element in common for at least two sublists of a group. Therefore, each sublist in a specific group doesn't need to have an element in common with every sublist in its group, but it is enough that each sublist has at least one element in common with another sublist of its group - in other words, they can form a chain.

Example:

In: f(ls = [[3, 4, 2], [1, 3, 5], [9, 8, 7], [8, 10, 34], [34], [1], [23, 22, 21], [22], [21, 29], [100]])

Out:
{0: [[3, 4, 2], [1, 3, 5], [1]], 1: [[9, 8, 7], [8, 10, 34], [34]], 2: [[23, 22, 21], [22], [21, 29]], 3: [[100]]}

Who is f()?

I'll keep trying to solve it for while.

CodePudding user response:

This implementation has been tested on the example provided. Efficiency and brevity have both been sacrificed for the sake of readability.

The general plan is this: we start with each list in a separate group, and then combine groups with overlapping elements. We continue until no two groups overlap, and then we return the groups in the format required.

To make our code a little cleaner, we define a helper object called Group that will help us track the groups as we put them together. A Group includes two things: a list of lists, and a set that combines all items in those lists.

@dataclass(eq=False)
class Group:
    lists: list[list]
    all_items: set

Then, we define a utility function that takes a list of groups and returns two that have overlapping elements, if any such are found (otherwise it returns two Nones):

def find_intersecting_groups(groups: list[Group]) -> Tuple[Group, Group] | Tuple[None, None]:
    # Iterate over all groups (except the last one)
    for first_index, first_group in enumerate(groups[:-1]):
        # Iterate over all groups after this group
        for second_group in groups[first_index   1:]:
            # If the intersection of the groups' items is not empty,
            # i.e., the groups share at least one item
            if first_group.all_items & second_group.all_items:
                # Return the groups found
                return first_group, second_group

    # Otherwise, return two Nones
    return None, None

Now we can write the function requested in a readable way:

def f(ls):
    # Start with one group per list
    groups = [Group([l], set(l)) for l in ls]

    # Loop until we reach equilibrium
    while True:
        # Find groups to be merged
        first_group, second_group = find_intersecting_groups(groups)

        # If none were found, we are at equilibrium
        if first_group is None:
            break

        # Otherwise, merge the second group into the first group
        first_group.lists  = second_group.lists
        first_group.all_items |= second_group.all_items

        # And then delete the second group
        groups.remove(second_group)

    # Return a dictionary, where the key is the index of the group,
    # and the value is the lists in the group
    return {index: group.lists for (index, group) in enumerate(groups)}
  • Related