Assuming a list as follows:
list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']
and a substring
to_find = 'eos'
I would like to find the string(s) in list_of_strings that match the substring. The output from list_of_strings should be ['seo', 'paseo', 'oes'] (since it has all the letters in the to_find substring).
I tried a couple of things:
a = next((string for string in list_of_strings if to_find in string), None) # gives NoneType object as output
and
result = [string for string in list_of_strings if to_find in string] # gives [] as output
but neither of them works.
Can someone please tell me what mistake I am making?
Thanks
CodePudding user response:
Logically, your problem is to compare the set of characters in the word to find against the set of characters in each word in the list. If a word in the list contains all the characters of the word to find, then it is a match. Here is one approach using a list comprehension along with set intersection:
list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']
to_find = 'eos'
to_find_set = set(to_find)
output = [x for x in list_of_strings if len(to_find_set.intersection(x)) == len(to_find_set)]
print(output) # ['seo', 'paseo', 'oes']
If you want to retain an empty string placeholder for any input string which does not match, then use this version:
output = [x if len(to_find_set.intersection(x)) == len(to_find_set) else '' for x in list_of_strings]
print(output) # ['', '', '', 'seo', 'paseo', 'oes']
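As for why the original attempts return nothing: the in operator on strings tests for a contiguous substring in the same order, so 'eos' in 'paseo' is False even though all three letters are present. The set comparison above can also be written with set.issubset, which may read more directly than comparing lengths. This is only a sketch using the same names as above; note that sets ignore repeated letters, so a to_find such as 'ee' would be treated the same as 'e'.

```python
list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']
to_find = 'eos'

# `in` looks for the exact character sequence, so this is False
in_paseo = to_find in 'paseo'
print(in_paseo)  # False

to_find_set = set(to_find)
# issubset accepts any iterable, so each word can be passed in directly
output = [x for x in list_of_strings if to_find_set.issubset(x)]
print(output)  # ['seo', 'paseo', 'oes']
```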
CodePudding user response:
Do you need the letters of to_find to be next to each other, or is it enough that all the letters appear somewhere in the word? Basically: does seabco match or not?
[Your question does not include this detail and you use "substring" a lot but also "since it has all the letters in the to_find", so I'm not sure how to interpret it.]
If seabco matches, then @Tim Biegeleisen's answer is the correct one. If the letters need to be next to each other (but in any order, of course), then read on.
If to_find is relatively short, you can just generate all permutations of its letters (n! of them, so here 3! = 6: eos, eso, oes, ose, seo, soe) and check each one with in.
import itertools
list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']
to_find = 'eos'
result = [string for string in list_of_strings if any("".join(perm) in string for perm in itertools.permutations(to_find))]
https://docs.python.org/3/library/itertools.html#itertools.permutations
We do "".join(perm) because perm is a tuple and we need a string.
>>> result = [string for string in list_of_strings if any("".join(perm) in string for perm in itertools.permutations(to_find))]
>>> result
['seo', 'paseo', 'oes']
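Since the number of permutations grows as n!, a longer to_find may make the approach above slow. One possible alternative, sketched here under the same "adjacent letters, any order" interpretation, is to slide a window of len(to_find) over each word and compare sorted letters. The helper name contains_anagram is made up for this example and is not from any library.

```python
list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']
to_find = 'eos'

def contains_anagram(word, letters):
    """Return True if some contiguous slice of word is a rearrangement of letters."""
    n = len(letters)
    target = sorted(letters)
    # Check every window of length n; words shorter than n yield no windows
    return any(sorted(word[i:i + n]) == target for i in range(len(word) - n + 1))

result = [s for s in list_of_strings if contains_anagram(s, to_find)]
print(result)  # ['seo', 'paseo', 'oes']

# The letters e, o, s are all present in 'seabco' but never adjacent
print(contains_anagram('seabco', to_find))  # False
```

Unlike the set-based answer, this version also respects repeated letters, because sorting keeps duplicates.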