Distance between occurrences of a word in text using pandas

I've been trying to get the distance between repeated occurrences of the same word in a text, but without success. Is there any method in pandas to get the distance between occurrences of a word?

Code:

import pandas as pd

long_string = "one are marked by the ()meta-characters. two They group together the expressions contained one inside them, and you can one repeat the contents of a group with a repeating qualifier, such as there"

my_text = pd.Series([long_string])

result = my_text.str.count("one")
print(result)
print(len(long_string))

#output
0    3
dtype: int64
196

As you can see, I'm looking for the word "one" in the text. It occurs 3 times: the first occurrence is at index 0, and there are 12 words between it and the second occurrence. How do I get this distance using Python or pandas?

CodePudding user response:

You can split the text on 'one' and count the spaces in each resulting piece. Each piece between two occurrences starts and ends with a space, so the number of words in it is the number of spaces minus one.

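long_string = "one are marked by the ()meta-characters. two They group together the expressions contained one inside them, and you can one repeat the contents of a group with a repeating qualifier, such as there"
# Split on the target word; the last piece follows the final occurrence, so drop it.
text_split = long_string.split('one')[:-1]
length = []
for piece in text_split:
    if piece:  # skip the empty piece produced when the text starts with 'one'
        # A piece looks like " w1 w2 ... wn ": n + 1 spaces for n words.
        length.append(piece.count(' ') - 1)
print(length)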

>>> [12, 5]
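
One caveat with splitting on 'one' is that it also matches the substring inside longer words such as "someone" or "done". A minimal alternative sketch, assuming that "distance" means the number of words strictly between two consecutive whole-word occurrences, is to record the word positions and subtract neighbours. The function name word_gaps and the set of stripped punctuation characters are illustrative choices, not part of pandas or of the answer above.

```python
def word_gaps(text, target):
    """Count the words between consecutive whole-word occurrences of target."""
    # Split on whitespace and strip surrounding punctuation so "one," still matches.
    # The punctuation set is an assumption; extend it for other texts.
    words = [w.strip(".,;:!?()\"'") for w in text.split()]
    # Word indices at which the target occurs.
    positions = [i for i, w in enumerate(words) if w == target]
    # Words strictly between each pair of neighbouring occurrences.
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


long_string = "one are marked by the ()meta-characters. two They group together the expressions contained one inside them, and you can one repeat the contents of a group with a repeating qualifier, such as there"

print(word_gaps(long_string, "one"))
```

For the sample text this prints [12, 5], the same as the split-based version. To run it on every row of a pandas Series, something like my_text.apply(word_gaps, target="one") should work, since Series.apply forwards extra keyword arguments to the function.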