How to get surrounding words of substring in string, if the substring repeats itself?

Time:07-01

I have a task where I need to fetch the N words before and after every occurrence of a substring (which may span multiple words) in a string. I initially considered using str.split(" ") and working with the resulting list, but the problem is that the substring I'm looking for can be multiple words.

I've tried using str.partition, and it's very close to doing exactly what I want, but it only splits on the first occurrence of the keyword.

Code:

text = "Hello World how are you doing Hello is the keyword I'm trying to get Hello is a repeating word"
part = text.partition("Hello")
part = list(map(str.strip, part))

Output:

['', 'Hello', "World how are you doing Hello is the keyword I'm trying to get Hello is a repeating word"]

This gets me exactly what I need for the first occurrence of the keyword: I have enough to then get the preceding and following words. Unfortunately, it doesn't help when the substring I'm looking for appears more than once.

If the output could instead be a list of partitions, one for each occurrence, then I could make it work. How should I approach this?

CodePudding user response:

text = "Hello World how are you doing Hello is the keyword I'm trying to get Hello is a repeating word"

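def recursive_partition(text, pattern):
  # Split text around every occurrence of pattern, keeping the matches.
  if not text:
    return [text]  # return a list so it can be concatenated below
  tmp = text.partition(pattern)
  if tmp[1]:
    # Found a match: keep the text before it and the match, then recurse on the rest.
    return [tmp[0]] + [tmp[1]] + recursive_partition(tmp[2], pattern)
  else:
    return [tmp[0]]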

res = recursive_partition(text, "Hello")
print(res)  # ['', 'Hello', ' World how are you doing ', 'Hello', " is the keyword I'm trying to get ", 'Hello', ' is a repeating word']
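
If the end goal is the N words on either side of each occurrence, one possible alternative is to skip the partition list and use the match positions directly. The following is only a sketch: the function name surrounding_words and its parameters are made up for illustration, and it assumes words are separated by whitespace. Like str.partition, it also matches the phrase inside longer words.

```python
import re

def surrounding_words(text, phrase, n):
    """Return a (before, after) pair of word lists for every occurrence of phrase."""
    results = []
    # re.escape makes the phrase match literally; finditer yields
    # each non-overlapping match from left to right.
    for m in re.finditer(re.escape(phrase), text):
        # Guard n == 0, because a [-0:] slice would return the whole list.
        before = text[:m.start()].split()[-n:] if n > 0 else []
        after = text[m.end():].split()[:n]
        results.append((before, after))
    return results

text = "Hello World how are you doing Hello is the keyword I'm trying to get Hello is a repeating word"
for before, after in surrounding_words(text, "Hello", 2):
    print(before, after)
# [] ['World', 'how']
# ['you', 'doing'] ['is', 'the']
# ['to', 'get'] ['is', 'a']
```

This works the same way for a multi-word phrase such as "are you", and it avoids recursion, which may matter if the text contains a very large number of occurrences.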