Automatically find subsets in list of strings

Time:10-02

I have a list of strings, each consisting of either a single word or several words separated by commas. The list items are not unique and can repeat in any combination. At the moment the list looks like this:

list = ['Car', 'Bed, Car', 'Car', 'House', 'Sofa, Pen, Car', 'Pen', 'Pen', 'Car, Pen', 'Car']

Now I want all possible subsets of at least two consecutive list items. However, I want combinations of STRINGS, not WORDS: 'Car' and 'Bed, Car' would be a combination, but 'Car' and 'Bed' would not, because they don't appear as consecutive items in the list.

I have not found a way to do this yet. Every time I try to find subsets, the code focuses on words instead of entire strings...

CodePudding user response:

This may not be very Pythonic, but it works:

my_list = ['Car', 'Bed, Car', 'Car', 'House', 'Sofa, Pen, Car', 'Pen', 'Pen', 'Car, Pen', 'Car']
result = []

for i in range(len(my_list)):
    for j in range(i + 2, len(my_list) + 1):
        result += [my_list[i:j]]

print(result)
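
For what it's worth, the same nested-loop idea can be wrapped in a small function. This is only a sketch: `consecutive_runs` and its `min_len` parameter are names made up here for illustration and do not come from the answer above.

```python
def consecutive_runs(items, min_len=2):
    """Return every contiguous slice of items with at least min_len elements."""
    n = len(items)
    return [
        items[start:stop]
        for start in range(n)
        # stop is exclusive, so start + min_len gives the shortest allowed slice
        for stop in range(start + min_len, n + 1)
    ]


my_list = ['Car', 'Bed, Car', 'Car', 'House', 'Sofa, Pen, Car', 'Pen', 'Pen', 'Car, Pen', 'Car']
runs = consecutive_runs(my_list)

print(len(runs))  # 36 runs for this 9-item list
print(runs[0])    # ['Car', 'Bed, Car']
```

Because the slices are taken from the list itself, each run contains whole strings such as 'Bed, Car' and never individual words.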

CodePudding user response:

Try using the combinations function from the itertools module:

from itertools import combinations

my_list = ['Car', 'Bed, Car', 'Car', 'House', 'Sofa, Pen, Car', 'Pen', 'Pen', 'Car, Pen', 'Car']
my_set = set(my_list)
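result = [comb for r in range(2, len(my_set) + 1) for comb in combinations(my_set, r)]

This code builds a list of all possible combinations of the unique strings with a length of at least two. Note that it does not check whether the items are adjacent in the original list.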
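
One caveat: a set has no defined order, so the order of the combinations can differ between runs. A possible variant, assuming you want the unique strings kept in first-seen order, is to deduplicate with `dict.fromkeys` instead (`unique_items` is just an illustrative name):

```python
from itertools import combinations

my_list = ['Car', 'Bed, Car', 'Car', 'House', 'Sofa, Pen, Car', 'Pen', 'Pen', 'Car, Pen', 'Car']

# dict keys keep insertion order, so duplicates are dropped
# while the first occurrence of each string stays in place.
unique_items = list(dict.fromkeys(my_list))

# The upper bound is len(unique_items) + 1 so that the combination
# containing every unique string is included as well.
result = [
    comb
    for r in range(2, len(unique_items) + 1)
    for comb in combinations(unique_items, r)
]

print(len(unique_items))  # 6
print(len(result))        # 57
print(result[0])          # ('Car', 'Bed, Car')
```

This still produces combinations regardless of adjacency, so it answers a different question than the slicing approach if only consecutive items are wanted.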
