Suppose you have a string:
text = "coding in python is a lot of fun"
And character positions:
positions = [(0,6),(10,16),(29,32)]
These are intervals, which cover certain words within text, i.e. coding, python and fun, respectively.
Using the character positions, how could you split the text on those words, to get this output:
['coding','in','python','is a lot of','fun']
This is just an example, but it should work for any string and any list of character positions.
I'm not looking for this:
[text[i:j] for i,j in positions]
CodePudding user response:
I'd flatten positions to be [0,6,10,16,29,32] and then do something like:
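# positions has already been flattened to [0, 6, 10, 16, 29, 32]
positions.append(len(text))  # close the final slice at the end of the text
prev_positions = [0] + positions
words = []
for begin, end in zip(prev_positions, positions):
    words.append(text[begin:end])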
This exact code produces ['', 'coding', ' in ', 'python', ' is a lot of ', 'fun', ''], so it needs some additional work to strip the whitespace and drop the empty strings.
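To round that off, here is a minimal sketch of the same idea with the stripping and filtering included. The function name split_on_spans is made up for illustration, and it assumes the positions are sorted and do not overlap. Note that it also strips the covered words themselves and drops any piece that ends up empty.

```python
def split_on_spans(text, positions):
    """Split text into the covered words and the stripped gaps between them."""
    # Flatten [(0, 6), (10, 16)] into [0, 6, 10, 16] and add both ends of the text.
    bounds = [0] + [p for span in positions for p in span] + [len(text)]
    # Slice between each pair of neighbouring boundaries and trim whitespace.
    pieces = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    # Drop pieces that are empty after trimming.
    return [p for p in pieces if p]


text = "coding in python is a lot of fun"
positions = [(0, 6), (10, 16), (29, 32)]
print(split_on_spans(text, positions))
# ['coding', 'in', 'python', 'is a lot of', 'fun']
```

Because the boundaries include 0 and len(text), any text before the first interval or after the last one is kept as well.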
CodePudding user response:
The code below works as expected:
text = "coding in python is a lot of fun"
positions = [(0,6),(10,16),(29,32)]
textList = []
lastIndex = 0
for n, (start, end) in enumerate(positions):
    if n > 0:
        # text between the previous word and this one
        textList.append(text[lastIndex:start])
    textList.append(text[start:end])
    # skip the single separator character that follows the word
    lastIndex = end + 1
print(textList)
Output: ['coding', 'in ', 'python', 'is a lot of ', 'fun']
Note: If the trailing spaces are not needed, you can strip them from each element.
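For example, a list comprehension with str.strip() would do the trimming. This is only a sketch that starts from the output shown above; the name trimmed is just for illustration.

```python
textList = ['coding', 'in ', 'python', 'is a lot of ', 'fun']
# Remove leading and trailing whitespace from every piece.
trimmed = [piece.strip() for piece in textList]
print(trimmed)
# ['coding', 'in', 'python', 'is a lot of', 'fun']
```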