I have a task where I need to fetch the N words before and after every occurrence of a substring (which may span multiple words) in a string. I initially considered using str.split(" ") and working with the resulting list, but that breaks down because the substring I'm looking for can be multiple words.
I've tried str.partition, and it's very close to doing exactly what I want, but it only splits on the first occurrence of the keyword.
Code:
text = "Hello World how are you doing Hello is the keyword I'm trying to get Hello is a repeating word"
part = text.partition("Hello")
part = list(map(str.strip, part))
print(part)
Output:
['', 'Hello', "World how are you doing Hello is the keyword I'm trying to get Hello is a repeating word"]
This gets me exactly what I need for the first occurrence of the keyword: I have enough to then get the preceding and following words. Unfortunately, it fails when the substring I'm looking for repeats.
If the output could instead be a list of partitions, one per occurrence, I could make it work. How should I approach this?
CodePudding user response:
text = "Hello World how are you doing Hello is the keyword I'm trying to get Hello is a repeating word"
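def recursive_partition(text, pattern):
    # Nothing left to split: return an empty list so the caller can concatenate it.
    if not text:
        return []
    # partition() splits on the first occurrence only: (before, match, after).
    tmp = text.partition(pattern)
    if tmp[1]:
        # Keep the text before the match and the match itself, then recurse on the rest.
        return [tmp[0], tmp[1]] + recursive_partition(tmp[2], pattern)
    # No match: the remaining text is the final piece.
    return [tmp[0]]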
res = recursive_partition(text, "Hello")
print(res) # ['', 'Hello', ' World how are you doing ', 'Hello', " is the keyword I'm trying to get ", 'Hello', ' is a repeating word']
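As a possible alternative to recursion, here is a minimal sketch that builds the same kind of list with re.split and then collects the N words around each occurrence with re.finditer. The helper names split_keep and context_words are hypothetical, and the sketch assumes a non-empty pattern, plain substring matching (so "Hello" would also match inside a longer word), and whitespace-separated words.

```python
import re

def split_keep(text, pattern):
    # A capturing group makes re.split keep the matched pattern in the result.
    # re.escape treats the pattern as literal text rather than a regex.
    return re.split(f"({re.escape(pattern)})", text)

def context_words(text, pattern, n):
    # For every occurrence, return (up to n words before, up to n words after).
    results = []
    for match in re.finditer(re.escape(pattern), text):
        before = text[:match.start()].split()
        after = text[match.end():].split()
        # max(..., 0) avoids the before[-0:] pitfall when n is 0.
        results.append((before[max(len(before) - n, 0):], after[:n]))
    return results

text = "Hello World how are you doing Hello is the keyword I'm trying to get Hello is a repeating word"
print(split_keep(text, "Hello"))
print(context_words(text, "Hello", 2))
# [([], ['World', 'how']), (['you', 'doing'], ['is', 'the']), (['to', 'get'], ['is', 'a'])]
```

Because context_words slices the full text on either side of each match, the surrounding words can include earlier or later occurrences of the keyword itself. It also works for multi-word substrings such as "are you".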