I have a list of words
list1=['duck','crow','hen','sparrow']
and a list of sentences
list2=[['The crow eats'],['Hen eats blue seeds'],['the duck is cute'],['she eats veggies']]
I want to remove every occurrence of the word 'eats' if it appears immediately after any of the words from the list.
desired output= [['The crow'],['Hen blue seeds'],['the duck is cute'],['she eats veggies']]
def remove_eats(test):
    for i in test:
        for j in i:
            for word in list1:
                j=j.replace(word+" eats", word)
                print(j)
                break
remove_eats(list2)
The replace method is not really working for the strings. Could you help me out? Is it possible with regex?
CodePudding user response:
You can use a regex such as the one below, which is an alternation of positive lookbehinds, one per word.
(?:(?<=[Dd]uck)|(?<=[Cc]row)|(?<=[Hh]en)|(?<=[Ss]parrow))\s+eats?\b
Python example, using the built-in re module:
import re
list1 = ['duck', 'crow', 'hen', 'sparrow']
look_behinds = '|'.join(f'(?<=[{w[0].swapcase()}{w[0]}]{w[1:]})'
                        for w in list1)
EATS_RE = re.compile(rf'(?:{look_behinds})\s+eats?\b')
sentences = [['The crow eats'],
             ['Hen eats blue seeds'],
             ['the duck is cute'],
             ['she eats veggies']]
repl_sentences = [[EATS_RE.sub('', s, 1) for s in x] for x in sentences]
print(repl_sentences)
Out:
[['The crow'], ['Hen blue seeds'], ['the duck is cute'], ['she eats veggies']]
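As a possible alternative to the lookbehind pattern above, here is a minimal sketch that captures the listed word in a group and substitutes it back, dropping the " eats" that follows. The name EATS_AFTER_WORD_RE, the use of re.IGNORECASE, and the leading \b are assumptions added for illustration and are not part of the answer above. The \b keeps "then eats" from matching on "hen", and re.escape protects against words that contain regex metacharacters. Note that this version matches only "eats", not "eat".

```python
import re

list1 = ['duck', 'crow', 'hen', 'sparrow']

# Build an alternation of the listed words (escaped in case a word
# contains regex metacharacters).
words = '|'.join(re.escape(w) for w in list1)

# Match a listed word as a whole word, followed by whitespace and "eats".
# Group 1 holds the word so it can be kept in the replacement.
EATS_AFTER_WORD_RE = re.compile(rf'\b({words})\s+eats\b', re.IGNORECASE)

def remove_eats(nested_sentences):
    """Return a new nested list with 'eats' removed after any word in list1."""
    return [[EATS_AFTER_WORD_RE.sub(r'\1', s) for s in group]
            for group in nested_sentences]

list2 = [['The crow eats'], ['Hen eats blue seeds'],
         ['the duck is cute'], ['she eats veggies']]
result = remove_eats(list2)
print(result)
```

Unlike the loop in the question, this builds and returns a new list instead of rebinding the loop variable, so the result is available to the caller.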