I am trying to collect the words of a text, walking backwards from a given position (in this example, the last word) until I reach any stopword (I have a list of stopwords).
The code I have is this:
stopwords = ['one','this','or']
mytext = 'this is a text with a car more than this other blue moon name'
result=[]
for word in mytext.split()[::-1]:
    if word not in stopwords:
        result.append(word)
    else:
        break
print(' '.join(result[::-1]))
This works perfectly: the result is "other blue moon name". However, I have a hunch (which I cannot prove) that there should be a neater way than this bulky code for such a small task.
Any ideas for a one-liner?
CodePudding user response:
There's actually a reasonable way to do it in a one-liner using itertools:
from itertools import takewhile
stopwords = ['one','this','or']
mytext = 'this is a text with a car more than this other blue moon name'
result = " ".join(list(takewhile(lambda x: x not in stopwords, reversed(mytext.split())))[::-1])
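To make the one-liner easier to read, here is a minimal sketch that unpacks it into a small function. The name `tail_until_stopword` is only an illustration (it is not from the question), and converting the stopwords to a set is my own addition to speed up the membership test.

```python
from itertools import takewhile

def tail_until_stopword(text, stopwords):
    """Return the trailing words of `text` that come after the last stopword.

    Hypothetical helper: the name and the set conversion are assumptions.
    """
    stop = set(stopwords)  # set membership is cheaper than scanning a list
    # Walk the words from the end and keep them until the first stopword.
    tail_reversed = takewhile(lambda w: w not in stop, reversed(text.split()))
    # The words were collected back to front, so restore the original order.
    return " ".join(reversed(list(tail_reversed)))

mytext = 'this is a text with a car more than this other blue moon name'
print(tail_until_stopword(mytext, ['one', 'this', 'or']))
```

If the last word is itself a stopword, this returns an empty string; if the text contains no stopword, it returns all the words.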
It might be easier with a regex, though:
import re
stopwords = ['one','this','or']
mytext = 'this is a text with a car more than this other blue moon name'
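# Build the pattern from the stopwords instead of writing it by hand.
# Written manually it would be r'.*\b(?:one|this|or)\b\W?(.*$)'.
# The \b word boundaries keep 'or' from matching inside words like 'more'.
rstr = rf'.*\b(?:{"|".join(map(re.escape, stopwords))})\b\W?(.*$)'
# Note: re.match returns None if the text contains no stopword at all.
result = re.match(rstr, mytext).group(1)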
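Two things are worth watching with a pattern built from the stopwords: an alternative such as `or` can match inside a longer word like `more` unless it is anchored with word boundaries, and `re.match` returns `None` when no stopword occurs. The sketch below handles both; the function name `tail_after_last_stopword` and the choice to return the whole text when nothing matches are my assumptions, not part of the answer.

```python
import re

def tail_after_last_stopword(text, stopwords):
    """Regex variant: return everything after the last whole-word stopword.

    Hypothetical helper; returning the whole text when no stopword
    occurs is an assumed fallback.
    """
    if not stopwords:
        return text
    # Escape each stopword so regex metacharacters are taken literally.
    alternatives = "|".join(re.escape(w) for w in stopwords)
    # The greedy .* makes the match land on the LAST stopword;
    # \b ensures only whole words count.
    pattern = rf".*\b(?:{alternatives})\b\W?(.*)$"
    match = re.match(pattern, text)
    return match.group(1) if match else text

# 'or' inside 'more' is ignored thanks to the word boundaries.
print(tail_after_last_stopword('this is more blue', ['this', 'or']))
```

Unlike the `split()`-based versions, a regex like this also treats a stopword followed by punctuation (for example `this,`) as a match, which may or may not be what you want.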
CodePudding user response:
The only solution I can think of without an explicit loop is the following; unfortunately, it requires importing numpy:
import numpy as np
#input
stopwords = ['one','this','or']
mytext = 'this is a text with a car more than this other blue moon name'
#ONELINER
" ".join(mytext.split(" ")[-np.isin(mytext.split(" ")[::-1], stopwords).argmax():])
#output
'other blue moon name'
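One caveat with the `argmax()` approach: it returns 0 both when the last word is a stopword and when no stopword occurs at all, and in either case `[-0:]` keeps the whole text. If that distinction matters, the same slicing idea can be sketched in plain Python with no import. The name `tail_by_index` is hypothetical, and the generator does still iterate internally.

```python
def tail_by_index(text, stopwords):
    """Slice the word list just past the last stopword (plain-Python sketch).

    Hypothetical helper, not from the answer above.
    """
    words = text.split()
    stop = set(stopwords)
    # Scan indices from the end; `start` is the position right after the
    # last stopword, or 0 when the text contains no stopword.
    start = next(
        (i + 1 for i in range(len(words) - 1, -1, -1) if words[i] in stop),
        0,
    )
    return " ".join(words[start:])

mytext = 'this is a text with a car more than this other blue moon name'
print(tail_by_index(mytext, ['one', 'this', 'or']))
```

Here a trailing stopword gives an empty string, while a text without stopwords comes back unchanged.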