Let's say we have a list of N lists. For example:
L = [['A','B','C','D','E'], ['A','B','C'],['B','C','D'],['C','D'],['A','C','D']]
I want to find the longest common subsets that occur in this list and the corresponding counts. In this case:
ans = {'A,B,C':2, 'A,C,D':2, 'B,C,D':2}
I think this question is similar to mine, but I am having a hard time understanding the C# code.
CodePudding user response:
I assume that a "common subset" is a set that is a subset of at least two lists in the array.
With that in mind, here's one solution.
from itertools import combinations
from collections import Counter
L = [['A','B','C','D','E'], ['A','B','C'],['B','C','D'],['C','D'],['A','C','D']]
L = [*map(frozenset,L)]
sets = [l1&l2 for l1,l2 in combinations(L,2)]
maxlen = max(len(s) for s in sets)
sets = [s for s in sets if len(s) == maxlen]
count = Counter(s for s in sets for l in L if s <= l)
dic = {','.join(s):k for s,k in count.items()}
Resulting dictionary dic
:
{'A,B,C': 2, 'B,C,D': 2, 'A,C,D': 2}