I've been trying to get the distance between repeated occurrences of the same word in a text, but without success. Is there a method in pandas to get the distance between the repeated words?
Code:
import pandas as pd

long_string = "one are marked by the ()meta-characters. two They group together the expressions contained one inside them, and you can one repeat the contents of a group with a repeating qualifier, such as there"
my_text = pd.Series([long_string])
result = my_text.str.count("one")
print(result)
print(len(long_string))
#output
0    3
dtype: int64
196
As you can see, I'm looking for the word one in the text. It is used 3 times: the first occurrence is at index 0, and there are 12 words between it and the second occurrence. How do I get this distance using Python or pandas?
CodePudding user response:
You can split the text on 'one' and measure the length of each resulting piece, for example by counting the spaces between the words.
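long_string = "one are marked by the ()meta-characters. two They group together the expressions contained one inside them, and you can one repeat the contents of a group with a repeating qualifier, such as there"
# Drop the last piece: it is the text after the final 'one', not a gap.
text_split = long_string.split('one')[:-1]
length = []
for i in text_split:
    # Skip the empty piece produced when the text starts with 'one'.
    if i:
        # Each piece has a leading and a trailing space, so spaces - 1 = words.
        length.append(i.count(' ') - 1)
print(length)
#output
[12, 5]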
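One thing to watch with splitting on 'one' is that it matches substrings, so words like "none" or "done" would also split the text. As an alternative sketch (not part of the answer above), you could compare whole words by position instead. The function name word_gaps and the set of punctuation characters stripped from each token are my own assumptions for illustration:

```python
def word_gaps(text, target):
    """Return the number of words between consecutive occurrences of target."""
    # Split on whitespace and strip surrounding punctuation (an assumed set),
    # so that a token like "one," still matches "one".
    tokens = [tok.strip(".,;:!?()\"'") for tok in text.split()]
    # Word positions where the token is exactly the target word.
    positions = [i for i, tok in enumerate(tokens) if tok == target]
    # Words strictly between each pair of neighbouring occurrences.
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


long_string = "one are marked by the ()meta-characters. two They group together the expressions contained one inside them, and you can one repeat the contents of a group with a repeating qualifier, such as there"
print(word_gaps(long_string, "one"))  # [12, 5]
```

If the text lives in a pandas Series, the same function can be applied per row, e.g. my_text.apply(word_gaps, target="one"), which gives one list of gaps per row.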