Here is the data I have.
t = 'Billy and Willy and Billy and someone'
words = ['Billy', 'Willy', 'Billy']
I plan to find the words in order: first I find Billy, then I cut the string off at the end of that match.
For example:
new_t = ' and Willy and Billy and someone'
Then I search for Willy in the remainder, and so on.
Here is what I have written:
import re
t = 'Billy and Willy and Billy and someone'
words = ['Billy', 'Willy', 'Billy']
indexes = []
j = 0
for i in words:
    l = re.search(i, t[j:]).span()
    indexes.append(l)
    j = l[1]
I know this is wrong. Can you help me get a result like this:
Billy = (0,5)
Willy = (10,15)
Billy = (20,25)
CodePudding user response:
To find exact substrings, you don't need re. You can use str.index instead:
t = 'Billy and Willy and Billy and someone'
words = ['Billy', 'Willy', 'Billy']
indexes = []
current_pos = 0
for word in words:
    ind = t.index(word, current_pos)
    indexes.append((ind, ind + len(word)))
    current_pos = ind + 1
print(indexes)  # [(0, 5), (10, 15), (20, 25)]
for w, i in zip(words, indexes):
    print(w, '=', i)
# Billy = (0, 5)
# Willy = (10, 15)
# Billy = (20, 25)
The second parameter of index is the starting position of the search, so you only need to update the starting position (current_pos) once a search is done.
Or, with the walrus operator (Python 3.8+), you can shorten the loop to:
b = 0
indexes = [(a := t.index(w, b), b := a + len(w)) for w in words]
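One thing to watch for: str.index raises ValueError when the word is not found at or after the starting position. If that can happen in your data, a variant using str.find (which returns -1 instead of raising) may be easier to work with. This is only a sketch: the function name find_in_order and the choice to record None for a missing word are my own assumptions, not something from the question.

```python
def find_in_order(text, words):
    """Return a (start, end) span for each word, searching left to right."""
    spans = []
    pos = 0  # position where the next search starts
    for word in words:
        start = text.find(word, pos)  # -1 when the word is not found
        if start == -1:
            spans.append(None)  # assumption: mark a missing word with None
            continue
        end = start + len(word)
        spans.append((start, end))
        pos = end  # resume right after the matched word
    return spans


t = 'Billy and Willy and Billy and someone'
print(find_in_order(t, ['Billy', 'Willy', 'Billy']))  # [(0, 5), (10, 15), (20, 25)]
print(find_in_order(t, ['Billy', 'Bob', 'Willy']))    # [(0, 5), None, (10, 15)]
```

Here the search resumes at the end of the previous match rather than one character after its start, which matches the "cut the string after the word" idea from the question.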
CodePudding user response:
Using re:
import re
t = 'Billy and Willy and Billy and someone'
words = 'Billy', 'Willy'
for match in re.finditer('|'.join(words), t):
    print(f"{match[0]} = {match.span()}")
Output:
Billy = (0, 5)
Willy = (10, 15)
Billy = (20, 25)
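Note that re.finditer reports every occurrence of any listed word in the order it appears in the text, not in the order of the words list. If the order of the list matters, one possible approach is to search word by word and pass a start position to a compiled pattern's search method, which returns spans relative to the whole string rather than to a slice. The sketch below assumes the words are literal text (hence re.escape); the name find_in_order_re is made up for illustration.

```python
import re


def find_in_order_re(text, words):
    """Find each word in list order, each search starting after the previous match."""
    spans = []
    pos = 0
    for word in words:
        # re.escape treats the word as literal text, not as a regex
        pattern = re.compile(re.escape(word))
        # Pattern.search accepts a start position; spans stay absolute
        match = pattern.search(text, pos)
        if match is None:
            raise ValueError(f"{word!r} not found at or after position {pos}")
        spans.append(match.span())
        pos = match.end()
    return spans


t = 'Billy and Willy and Billy and someone'
words = ['Billy', 'Willy', 'Billy']
for word, span in zip(words, find_in_order_re(t, words)):
    print(f"{word} = {span}")
# Billy = (0, 5)
# Willy = (10, 15)
# Billy = (20, 25)
```

This also shows why searching t[j:] gives unexpected numbers: positions found in a slice are counted from the start of the slice, so they would need j added back to them.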