txt='thisis a red picture;thisis a yellow picture;thisis a good picture haha picture;thisis a bad picture haha picture;'
mycode=re.findall('thisis(.+?)picture',txt,re.DOTALL)
myresult: [' a red ', ' a yellow ', ' a good ', ' a bad ']
The result I need: [' a red ', ' a yellow ']
1. I want to extract all text between 'thisis' and 'picture', but drop certain strings such as "a good picture" and "a bad picture". 2. It must use the re.findall method.
CodePudding user response:
Is there a pattern that these certain strings follow? Without this information, the best you can do is:
myresult.remove(' a good ')
myresult.remove(' a bad ')
CodePudding user response:
Try this:
import re
regex = 'thisis(.+?)picture'
text = 'thisis a red picture;thisis a yellow picture;thisis a good picture haha picture;thisis a bad picture haha picture;'
# Capture everything between 'thisis' and the next 'picture'
data = re.findall(regex, text, re.IGNORECASE)
# Substrings that mark a capture as unwanted
noise = ['good', 'bad']
# Keep only the captures containing none of the noise words, with whitespace trimmed
result = [i.strip() for i in data if not any(j in i for j in noise)]
print(result)
Output: ['a red', 'a yellow']
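If the filtering has to happen inside the re.findall call itself, a negative lookahead placed right after 'thisis' is one possible approach. The following is only a sketch under some assumptions: the names excluded, alternation and pattern are made up for illustration, and it assumes the unwanted strings always have the form "a <word> picture" directly after 'thisis'.

```python
import re

txt = ('thisis a red picture;thisis a yellow picture;'
       'thisis a good picture haha picture;'
       'thisis a bad picture haha picture;')

# Hypothetical list of words that should disqualify a match
excluded = ['good', 'bad']

# Build "good|bad", escaping each word in case it contains regex metacharacters
alternation = '|'.join(re.escape(word) for word in excluded)

# (?!...) is a negative lookahead: the match is rejected when 'thisis'
# is immediately followed by " a good picture" or " a bad picture"
pattern = rf'thisis(?!\s+a\s+(?:{alternation})\s+picture)(.+?)picture'

result = re.findall(pattern, txt, re.DOTALL)
print(result)  # [' a red ', ' a yellow ']
```

Unlike the list comprehension above, this keeps the surrounding spaces in each capture, which matches the result asked for in the question. If the unwanted strings can appear in other positions or forms, the lookahead would need to be adjusted.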