Match with optional positive lookahead

Time:04-02

I've got two strings in the following formats: Some_thing_here_1234 Match Me 1 and 1234 Match Me 1_1

In both cases I want the resultant match to be 1234 Match Me 1

So far I've got (?<=^|_)\d{4}\s.* which works, but in the case of string 2 it also captures the _1 at the end. I thought I could use a lookahead at the end with an optional part, such as (?<=^|_)\d{4}\s.*(?=_\d{1}$|$), but it always seems to fall through to the second alternative, so the _1 gets through.

Any help would be great

CodePudding user response:

You can use

(?<=^|_)\d{4}\s[^_]+

See the regex demo.

Details:

  • (?<=^|_) - a positive lookbehind that matches a location that is immediately preceded by either the start of the string or a _ char (equal to (?<![^_]))
  • \d{4} - four digits
  • \s - a whitespace char
  • [^_]+ - one or more chars other than _.
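As a quick check of the pattern above, here is a minimal Python sketch run against the two sample strings from the question. It assumes Python's `re` module, which rejects the variable-width lookbehind `(?<=^|_)`, so the sketch uses the equivalent `(?<![^_])` form mentioned in the details. The names `PATTERN`, `samples`, and `matches` are illustrative only.

```python
import re

# (?<![^_]) stands in for (?<=^|_): Python's re module only accepts
# fixed-width lookbehinds, and the two forms match the same positions.
PATTERN = re.compile(r"(?<![^_])\d{4}\s[^_]+")

# The two sample strings from the question.
samples = ["Some_thing_here_1234 Match Me 1", "1234 Match Me 1_1"]

# [^_]+ stops at the first underscore, so the trailing _1 is never consumed.
matches = [PATTERN.search(s).group(0) for s in samples]
print(matches)
```

Both strings should yield 1234 Match Me 1. Note that this approach relies on the wanted part never containing an underscore itself.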

CodePudding user response:

Your second pattern (?<=^|_)\d{4}\s.*(?=_\d{1}$|$) is greedy: .* first consumes the rest of the line, and at the end of the string the second alternative $ in the lookahead matches, so you keep matching the whole line.

Note that you can omit {1}, as \d{1} is the same as \d.

If you want to use an optional part in the lookahead, you can make the match non-greedy and optionally match _\d in the lookahead, followed by the end of the string.

(?<=^|_)\d{4}\s.*?(?=(?:_\d)?$)

See a regex demo.
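The difference between the greedy and the non-greedy version can be seen side by side in a small Python sketch. As an assumption for portability, it again replaces `(?<=^|_)` with the equivalent `(?<![^_])`, because Python's `re` module does not accept variable-width lookbehinds. The names `GREEDY`, `LAZY`, and `results` are illustrative only.

```python
import re

# Greedy: .* runs to the end of the string, where the $ alternative succeeds.
GREEDY = re.compile(r"(?<![^_])\d{4}\s.*(?=_\d$|$)")

# Non-greedy: .*? grows one char at a time and stops at the first position
# where an optional _\d followed by the end of the string can match.
LAZY = re.compile(r"(?<![^_])\d{4}\s.*?(?=(?:_\d)?$)")

samples = ["Some_thing_here_1234 Match Me 1", "1234 Match Me 1_1"]

# Map each sample to its (greedy match, non-greedy match) pair.
results = {
    s: (GREEDY.search(s).group(0), LAZY.search(s).group(0))
    for s in samples
}
for s, (greedy, lazy) in results.items():
    print(f"{s!r}: greedy={greedy!r}, lazy={lazy!r}")
```

For the second string, the greedy pattern keeps the trailing _1 while the non-greedy one drops it; for the first string both give the same result.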
