Remove a particular word in a list of lists if it appears after a set of words

Time:09-03

I have a list of words, list1=['duck','crow','hen','sparrow'], and a list of sentences, list2=[['The crow eats'],['Hen eats blue seeds'],['the duck is cute'],['she eats veggies']]. I want to remove every occurrence of the word 'eats' when it appears immediately after any of the words in list1.

Desired output: [['The crow'],['Hen blue seeds'],['the duck is cute'],['she eats veggies']]

def remove_eats(test):
  for i in test:
    for j in i:
     for word in list1:
        j=j.replace(word + " eats", word)
        print(j)
        break

remove_eats(list2)

The replace method is not working for these strings. Could you help me out? Is it possible with regex?

CodePudding user response:

You can use a regex like the one below, which is an alternation of positive lookbehinds, one per word.

(?:(?<=[Dd]uck)|(?<=[Cc]row)|(?<=[Hh]en)|(?<=[Ss]parrow))\s+eats?\b

Python example, using the built-in re module:

import re

list1 = ['duck', 'crow', 'hen', 'sparrow']

look_behinds = '|'.join(f'(?<=[{w[0].swapcase()}{w[0]}]{w[1:]})'
                        for w in list1)

EATS_RE = re.compile(rf'(?:{look_behinds})\s+eats?\b')

sentences = [['The crow eats'],
             ['Hen eats blue seeds'],
             ['the duck is cute'],
             ['she eats veggies']]

repl_sentences = [[EATS_RE.sub('', s) for s in x] for x in sentences]
print(repl_sentences)

Out:

[['The crow'], ['Hen blue seeds'], ['the duck is cute'], ['she eats veggies']]
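One caveat with the lookbehind approach: a lookbehind like (?<=[Hh]en) only checks the preceding characters, so it would also fire after a word such as 'then'. A possible variation, offered as a sketch rather than the definitive fix, is to capture the word between word boundaries and put it back with a backreference. The helper names build_pattern and remove_eats are made up for this example, and it assumes you want a case-insensitive match on exactly 'eats' (not 'eat'), with every occurrence in a sentence removed.

```python
import re

# Hypothetical helpers; only list1 and list2 come from the question.
def build_pattern(words):
    # re.escape guards against regex metacharacters in the word list.
    # The leading \b stops 'hen' from matching inside 'then'.
    alternation = '|'.join(re.escape(w) for w in words)
    return re.compile(rf'\b({alternation})\s+eats\b', re.IGNORECASE)

def remove_eats(nested, words):
    pattern = build_pattern(words)
    # r'\1' re-inserts the captured word with its original casing,
    # so only the whitespace and 'eats' are dropped.
    return [[pattern.sub(r'\1', s) for s in group] for group in nested]

list1 = ['duck', 'crow', 'hen', 'sparrow']
list2 = [['The crow eats'], ['Hen eats blue seeds'],
         ['the duck is cute'], ['she eats veggies']]

print(remove_eats(list2, list1))
# [['The crow'], ['Hen blue seeds'], ['the duck is cute'], ['she eats veggies']]
```

Because re.IGNORECASE applies to the whole pattern, 'EATS' would be removed as well; drop the flag and spell out the case variants if that is not what you want.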