How to extract a string between two markers, but not equal to something?

txt='thisis a red picture;thisis a yellow picture;thisis a good picture haha picture;thisis a bad picture haha picture;'

mycode = re.findall('thisis(.+?)picture', txt, re.DOTALL)

myresult: [' a red ', ' a yellow ', ' a good ', ' a bad ']

I need the result: [' a red ', ' a yellow ']

1. I want to extract all text between 'thisis' and 'picture', but drop certain strings: "a good picture" and "a bad picture".
2. I must use the re.findall method.

CodePudding user response：

Is there a pattern that these certain strings follow? Without this information, the best you can do is:

myresult.remove(' a good ')
myresult.remove(' a bad ')

CodePudding user response：

Try this:

import re

regex = 'thisis(.+?)picture'
text = ('thisis a red picture;thisis a yellow picture;'
        'thisis a good picture haha picture;thisis a bad picture haha picture;')
# Capture everything between 'thisis' and the nearest following 'picture'
data = re.findall(regex, text, re.IGNORECASE)
# Drop any match that contains one of the unwanted words
noise = ['good', 'bad']
result = [i.strip() for i in data if not any(j in i for j in noise)]
print(result)

>>>>  ['a red', 'a yellow']
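
If the filtering has to happen inside the pattern itself, so that re.findall alone returns the wanted list, a negative lookahead is one option. The following is only a sketch: the `excluded` list and the `pattern` variable are names I made up for illustration, and it assumes the unwanted phrases always sit directly between 'thisis' and the first 'picture', as they do in the sample text.

```python
import re

txt = ('thisis a red picture;thisis a yellow picture;'
       'thisis a good picture haha picture;thisis a bad picture haha picture;')

# Hypothetical list of phrases to reject (not part of the original question's code)
excluded = ['a good', 'a bad']

# (?!...) is a negative lookahead: right after 'thisis', refuse to match
# if an excluded phrase followed by 'picture' comes next.
alternatives = '|'.join(re.escape(phrase) for phrase in excluded)
pattern = r'thisis(?!\s*(?:' + alternatives + r')\s*picture)(.+?)picture'

result = re.findall(pattern, txt, re.DOTALL)
print(result)
```

Unlike the list comprehension above, this keeps the surrounding spaces in each match, which is what the question's expected result shows. Call .strip() on each item if you want them removed.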