I want to limit a regex search to the text between 2 words (here, the <Notes> and </Notes> tags).
I have tried:
<Notes>(?:(?!<Notes>)[\s\S])*?sample(?:[\s\S\w] )<\/Notes>(?:(?<![\s\S\w]<\/Notes>))*?
with options /gmU
For Text:
TextBefore<Notes>this is sample notes</Notes>TextWiths1ample<Notes>this is sample notes</Notes>
and Text:
TextBefore<Notes>this is sample notes</Notes>TextWithsample<Notes>this is sample notes</Notes>
The screenshot below gives an idea of what I want to achieve: successful.
But the screenshot below shows that the regex is not limited to the text between the 2 words: failed
I hope someone can help me (there is a reason why I am not parsing this as XML).
Saved regexp: https://regex101.com/r/0fhNxI/1
CodePudding user response:
First of all, remove the U flag; it is very confusing since it swaps lazy and greedy quantifiers. Then, make sure you exclude both <Notes> and </Notes> from matching before sample. It is also a good idea to exclude sample itself, too, and use
<Notes>(?:(?!<\/?Notes>|sample)[\s\S])*sample[\s\S]*?<\/Notes>
Or,
<Notes>(?:(?!<\/?Notes>)[\s\S])*?sample[\s\S]*?<\/Notes>
See the regex demo #1 and regex demo #2.
Note that [\s\S\w] = [\s\S].
Details:
<Notes> - a fixed string
(?:(?!<\/?Notes>)[\s\S])*? - any char, zero or more occurrences (as few as possible), that does not start a <Notes> or </Notes> char sequence
sample - a fixed string
[\s\S]*? - any zero or more chars, as few as possible
<\/Notes> - a fixed string.
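As a rough illustration, here is how the second pattern could be tried in Python (the question itself used regex101, so the choice of Python is an assumption). The `tricky` input and the `NAIVE` pattern are hypothetical additions, included only to show what the lookahead prevents.

```python
import re

# Second pattern from this answer; "/" needs no escaping in Python.
NOTES_WITH_SAMPLE = re.compile(
    r"<Notes>(?:(?!</?Notes>)[\s\S])*?sample[\s\S]*?</Notes>"
)

# Hypothetical naive variant without the lookahead, for comparison only.
NAIVE = re.compile(r"<Notes>[\s\S]*?sample[\s\S]*?</Notes>")

# Second test string from the question.
text = ("TextBefore<Notes>this is sample notes</Notes>"
        "TextWithsample<Notes>this is sample notes</Notes>")

# Hypothetical input: the first block lacks the keyword, but "sample"
# appears between the two blocks.
tricky = ("<Notes>no keyword here</Notes>"
          "TextWithsample<Notes>this is sample notes</Notes>")

matches = NOTES_WITH_SAMPLE.findall(text)
tricky_matches = NOTES_WITH_SAMPLE.findall(tricky)
naive_matches = NAIVE.findall(tricky)

print(matches)         # each <Notes> block that contains "sample"
print(tricky_matches)  # only the second block
print(naive_matches)   # one match running from the first <Notes> to the last </Notes>
```

In the `tricky` case the lookahead stops the first part of the pattern at the closing </Notes>, so the keyword outside the tags cannot be used to complete a match.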
CodePudding user response:
/<([\w\s]*>).*?</\1/ig
You need a backreference and a lazy quantifier; maybe that was your problem.
Check the behaviour here: https://regex101.com/r/nVVsOQ/1
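A small sketch of the same pattern in Python (an assumed environment; the `i` flag maps to `re.IGNORECASE` and the `g` flag to iterating over all matches). The `other` string is a hypothetical input, added to show that this pattern pairs up any tag and does not itself check for the word sample.

```python
import re

# Python form of /<([\w\s]*>).*?</\1/ig
PAIRED_TAG = re.compile(r"<([\w\s]*>).*?</\1", re.IGNORECASE)

# Second test string from the question.
text = ("TextBefore<Notes>this is sample notes</Notes>"
        "TextWithsample<Notes>this is sample notes</Notes>")

# Hypothetical block that does not contain the keyword.
other = "<Notes>no keyword here</Notes>"

# group(0) is the whole match; group(1) is the captured tag name plus ">".
paired = [m.group(0) for m in PAIRED_TAG.finditer(text)]
other_paired = [m.group(0) for m in PAIRED_TAG.finditer(other)]

print(paired)
print(other_paired)
```

If only blocks containing sample are wanted, the keyword still has to be part of the pattern, as in the first answer.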