I have a list of lists and I want to group them according to at least one element in common for at least two sublists of a group. Therefore, each sublist in a specific group doesn't need to have an element in common with every sublist in its group, but it is enough that each sublist has at least one element in common with another sublist of its group - in other words, they can form a chain.
Example:
In: f(ls = [[3, 4, 2], [1, 3, 5], [9, 8, 7], [8, 10, 34], [34], [1], [23, 22, 21], [22], [21, 29], [100]])
Out:
{0: [[3, 4, 2], [1, 3, 5], [1]], 1: [[9, 8, 7], [8, 10, 34], [34]], 2: [[23, 22, 21], [22], [21, 29]], 3: [[100]]}
Who is f()
?
I'll keep trying to solve it for while.
CodePudding user response:
This implementation has been tested on the example provided. Efficiency and brevity have both been sacrificed for the sake of readability.
The general plan is this: we start with each list in a separate group, and then combine groups with overlapping elements. We continue until no two groups overlap, and then we return the groups in the format required.
To make our code a little cleaner, we define a helper object called Group
that will help us track the groups as we put them together. A Group
includes two things: a list of lists, and a set that combines all items in those lists.
@dataclass(eq=False)
class Group:
lists: list[list]
all_items: set
Then, we define a utility function that takes a list of groups and returns two that have overlapping elements, if any such are found (otherwise it returns two None
s):
def find_intersecting_groups(groups: list[Group]) -> Tuple[Group, Group] | Tuple[None, None]:
# Iterate over all groups (except the last one)
for first_index, first_group in enumerate(groups[:-1]):
# Iterate over all groups after this group
for second_group in groups[first_index 1:]:
# If the intersection of the groups' items is not empty,
# i.e., the groups share at least one item
if first_group.all_items & second_group.all_items:
# Return the groups found
return first_group, second_group
# Otherwise, return two Nones
return None, None
Now we can write the function requested in a readable way:
def f(ls):
# Start with one group per list
groups = [Group([l], set(l)) for l in ls]
# Loop until we reach equilibrium
while True:
# Find groups to be merged
first_group, second_group = find_intersecting_groups(groups)
# If none were found, we are at equilibrium
if first_group is None:
break
# Otherwise, merge the second group into the first group
first_group.lists = second_group.lists
first_group.all_items |= second_group.all_items
# And then delete the second group
groups.remove(second_group)
# Return a dictionary, where the key is the index of the group,
# and the value is the lists in the group
return {index: group.lists for (index, group) in enumerate(groups)}