How to find multiple words within a string that lie within a range (position)?

Time:03-24

What is the fastest way to see if multiple words are present in a string within a certain number of positions? Order does not matter.

For example:

Words: "banana","man","he"

String: He is a banana man and eats like crazy.

Range: 5

The function would return a match, as "he", "banana" and "man" lie within a distance of 5 words.

But if I give the range 2, the function would not return a match, as "man" is at least 3 words from "he".

CodePudding user response:

Try this:

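import re

# Renamed from `input` and `range` so the built-ins are not shadowed.
text = "He is a banana man and eats like crazy"

# Keep only the first `max_range` words, then search them for the target words.
max_range = 5
required_words = ' '.join(text.split(' ')[:max_range])
print(re.findall(r'\b(man|banana|he)\b', required_words, re.I))
# ['He', 'banana', 'man']

max_range = 2
required_words = ' '.join(text.split(' ')[:max_range])
print(re.findall(r'\b(man|banana|he)\b', required_words, re.I))
# ['He']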

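Here you can specify the range; it returns an empty list if none of the words is found. Note that a partial result such as ['He'] means only some of the words are in range, so compare the result against your word list to decide whether it is a full match.

First you get the required words by splitting the string at whitespace and then joining the required number (your range) of words from the start of the string.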

Then you use regex to check if your words are present in this range of words.
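The snippet above only looks at the first words of the string. If the words may appear anywhere, one possible variation is to slide a fixed-size window over the tokens. This is only a sketch: the function name words_within_range is made up for illustration, and it assumes "within range N" means the first and last matched word are at most N positions apart (so the window holds N + 1 words).

```python
import re


def words_within_range(text, words, max_range):
    """Return True if some window of max_range + 1 consecutive words
    contains every target word (case-insensitive, order ignored)."""
    # Tokenize on word characters so trailing punctuation is dropped.
    tokens = re.findall(r"\w+", text.lower())
    targets = {w.lower() for w in words}
    window = max_range + 1
    # Always try at least one window, even if the text is shorter than it.
    for start in range(max(1, len(tokens) - window + 1)):
        if targets <= set(tokens[start:start + window]):
            return True
    return False


sentence = "He is a banana man and eats like crazy."
print(words_within_range(sentence, ["banana", "man", "he"], 5))  # True
print(words_within_range(sentence, ["banana", "man", "he"], 2))  # False
```

This rebuilds a set for every window, so it is simple rather than fast; for long strings a single-pass approach would be preferable.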

CodePudding user response:

Assuming the words in the string will always be unique, you could use a dict to map words to their indices, then iterate over the combinations of words and find the differences of their positions:

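import itertools as it

s = "He is a banana man and eats like crazy"
words = ["banana", "he", "man"]
max_range = 2

# Map each lowercased word to its position in the string.
# A target word missing from the string raises KeyError below.
inds = {v.lower(): i for i, v in enumerate(s.split())}
# Distance between the positions of every pair of target words.
diffs = [abs(inds[a] - inds[b]) for a, b in it.combinations(words, 2)]
# A match requires every pair to be within max_range of each other.
match = all(diff <= max_range for diff in diffs)
print(match)
# False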
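
If the words are not guaranteed to be unique, a dict keeps only the last position of each word. A possible alternative, sketched here under the same "difference of positions" definition of range, is a single-pass sliding window that tracks the smallest span covering every target word. The names min_span and within_range are hypothetical helpers, not part of the answer above.

```python
import re
from collections import Counter


def min_span(tokens, targets):
    """Smallest (last index - first index) of any stretch of tokens that
    contains every target word, or None if some target never appears."""
    targets = set(targets)
    if not targets:
        return 0
    counts = Counter()
    covered = 0   # number of distinct targets inside the current window
    best = None
    left = 0
    for right, tok in enumerate(tokens):
        if tok in targets:
            counts[tok] += 1
            if counts[tok] == 1:
                covered += 1
        # Shrink from the left while the window still holds every target.
        while covered == len(targets):
            span = right - left
            if best is None or span < best:
                best = span
            if tokens[left] in targets:
                counts[tokens[left]] -= 1
                if counts[tokens[left]] == 0:
                    covered -= 1
            left += 1
    return best


def within_range(s, words, max_range):
    tokens = re.findall(r"\w+", s.lower())
    span = min_span(tokens, [w.lower() for w in words])
    return span is not None and span <= max_range


s = "He is a banana man and he eats a banana"
print(within_range(s, ["banana", "he", "man"], 3))  # True
print(within_range(s, ["banana", "he", "man"], 2))  # False
```

Each token is visited at most twice, so the cost grows linearly with the length of the string, and a missing word simply yields no match instead of a KeyError.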