Distance between repeated words in a text using pandas

Time:08-20

I've been trying to get the distance between repeated occurrences of a word in a text, but without success. Is there a method in pandas to get the distance between the repeated words?

Code:

import pandas as pd

long_string = "one are marked by the ()meta-characters. two They group together the expressions contained one inside them, and you can one repeat the contents of a group with a repeating qualifier, such as there"

my_text = pd.Series([long_string])

result = my_text.str.count("one")
print(result)
print(len(long_string))

#output
0    3
dtype: int64
196

As you can see, I'm looking for the word "one" in the text. It appears 3 times: the first occurrence is at index 0, and there are 12 words between it and the second occurrence. How do I get this distance using Python or pandas?

CodePudding user response:

You can split the text on 'one' and then measure each resulting piece, for example by counting the spaces between its words.

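long_string = ("one are marked by the ()meta-characters. two They group together "
               "the expressions contained one inside them, and you can one repeat the contents "
               "of a group with a repeating qualifier, such as there")
# Split on the target word; drop the last piece, which follows the final occurrence
text_split = long_string.split('one')[:-1]
length = []
for i in text_split:
    if i:  # skip the empty piece produced when the text starts with 'one'
        # each piece has a leading and a trailing space, so words = spaces - 1
        length.append(i.count(' ') - 1)
print(length)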

>>> [12, 5]
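
One caveat with splitting on the substring 'one' is that it would also match inside longer words such as "someone" or "done". Below is a minimal sketch that compares whole words instead. It assumes words are separated by whitespace and that punctuation attached to a word should be ignored; the helper name word_gaps is made up for illustration.

```python
import string


def word_gaps(text, target):
    """Return the number of words between consecutive occurrences of target.

    Hypothetical helper: assumes whitespace-separated words and strips
    leading/trailing punctuation before comparing.
    """
    words = [w.strip(string.punctuation) for w in text.split()]
    # Word positions (0-based) at which the target occurs
    positions = [i for i, w in enumerate(words) if w == target]
    # Words strictly between each pair of neighbouring occurrences
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


long_string = ("one are marked by the ()meta-characters. two They group together "
               "the expressions contained one inside them, and you can one repeat the contents "
               "of a group with a repeating qualifier, such as there")

print(word_gaps(long_string, "one"))  # [12, 5]
```

With a pandas Series the same helper can be applied per row, e.g. my_text.apply(lambda s: word_gaps(s, "one")), which should give a Series holding one list of gaps per row.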