I want to capture "foo" and every occurrence of "bar" that follows it. I also need to ignore any text between them, and "bar" is optional.
Example text:
foo ignoreme barbarbar
foo ignoreme bar
foo ignoreme
foo something abcbar
Expected:
foo barbarbar
foo bar
foo
foo bar
I tried this regex:
(foo)(?:.*)((?:bar)*)
But the .* consumes the rest of the string, so only foo is captured:
foo
foo
foo
foo
So I changed it to lazy to stop the capture:
(foo)(?:.*?)((?:bar)*)
I got almost the same result: only foo is captured.
It seems the capture stops too early. However, this almost works:
(foo)(?:.*?)((?:bar)+)
foo barbarbar
foo bar
<miss third line>
foo bar
But it misses the third line because, with this pattern, "bar" must appear at least once. Example here: https://regex101.com/r/NIUPew/1
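To make the three attempts concrete, here is a small sketch using Python's re module (an assumption on my side; the regex101 demo may use a different flavor, but these patterns should behave the same way). The variable names are only illustrative.

```python
import re

line = "foo ignoreme barbarbar"

# Greedy: .* swallows the rest of the line, so (?:bar)* can only match empty.
greedy = re.search(r"(foo)(?:.*)((?:bar)*)", line)

# Lazy: .*? matches nothing, and (?:bar)* is happy with zero repetitions
# right after "foo", so group 2 is empty again.
lazy = re.search(r"(foo)(?:.*?)((?:bar)*)", line)

# With +, "bar" becomes mandatory, so a line without it does not match at all.
plus = re.search(r"(foo)(?:.*?)((?:bar)+)", "foo ignoreme")

print(greedy.groups())
print(lazy.groups())
print(plus)
```

In both of the first two cases group 2 is the empty string rather than "barbarbar", which is why only foo shows up.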
Any idea from a regex guru? Thanks!
CodePudding user response:
You can move the repeated capturing group into the non-capturing group while making that group optional:
(foo)(?:.*?((?:bar)+))?
See the regex demo.
Details:
(foo) - Group 1: foo
(?:.*?((?:bar)+))? - an optional non-capturing group that will be tried at least once (because ? is a greedy quantifier matching the quantified pattern one or zero times) to match:
.*? - any zero or more chars other than line break chars, as few as possible
((?:bar)+) - Group 2: one or more bar char sequences.
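As a quick check of the pattern above, here is a minimal sketch in Python (the language choice and the helper name extract are assumptions, not part of the original answer). It runs the regex on the sample lines from the question and joins whichever groups participated in the match.

```python
import re

PATTERN = re.compile(r"(foo)(?:.*?((?:bar)+))?")

def extract(line):
    """Return 'foo' plus the bar run if present, or None when foo is absent."""
    m = PATTERN.search(line)
    if m is None:
        return None
    # Group 2 is None when the optional group did not match; skip it then.
    return " ".join(g for g in m.groups() if g)

lines = [
    "foo ignoreme barbarbar",
    "foo ignoreme",
    "foo ignoreme bar",
    "foo something abcbar",
]
results = [extract(line) for line in lines]
for r in results:
    print(r)
```

Because the optional group is tried before being skipped, the lazy .*? is forced to expand until a bar is found, and only falls back to matching just foo when no bar exists on the line.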
CodePudding user response:
You can search using this regex:
(\bfoo) .*?(?: \w*?((?:bar)+)\w*)?$
and replace with:
$1 $2
RegEx Breakup:
(\bfoo): 1st capture group to match foo after a word boundary
 .*?: Followed by a space and any text (lazy match)
(?: : Start non-capture group with a space
\w*?: Match 0 or more word chars (lazy)
((?:bar)+): Match 1+ repetitions of bar in capture group #2
\w*: Match 0 or more word chars
)?: End non-capture group. ? makes this an optional match
$: End
PS: The regex can be shortened to (\bfoo) .*?(?:((?:bar)+)\w*)?$
but it will be a bit slower.
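For reference, a minimal sketch of this search-and-replace in Python (an assumption; the answer does not name a language). Python writes the replacement as \1 \2 instead of $1 $2, and since Python 3.5 an unmatched group expands to an empty string, so the trailing space left on lines without bar is stripped here. The helper name rewrite is illustrative.

```python
import re

FULL = re.compile(r"(\bfoo) .*?(?: \w*?((?:bar)+)\w*)?$")
SHORT = re.compile(r"(\bfoo) .*?(?:((?:bar)+)\w*)?$")

def rewrite(pattern, line):
    # r"\1 \2" mirrors the "$1 $2" replacement; rstrip() drops the
    # trailing space produced when group 2 did not participate.
    return pattern.sub(r"\1 \2", line).rstrip()

lines = [
    "foo ignoreme barbarbar",
    "foo ignoreme bar",
    "foo ignoreme",
    "foo something abcbar",
]
full_results = [rewrite(FULL, line) for line in lines]
short_results = [rewrite(SHORT, line) for line in lines]
print(full_results)
print(short_results)
```

Both variants give the same output on the sample lines; the difference between them is only in how much backtracking the engine does.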