Regex: Find the minimal match... ? not cutting it

Time:06-27

I've read so many answers, and it is always to use the ? operator. I'm not sure what I'm missing.

Given a sample string such as:

the beginning stuff:0:evnt:some random stuff:MISTER GREEN:more stuff:1:evnt:again random stuff:MISS SCARLET:the rest of the stuff

What pattern can I use to get the "1" prior to "MISS SCARLET", without relying on knowing what's in any of the random stuff parts? I do know it will always be a digit followed by ":evnt" (without quotes) prior to the name we're looking for. In my case, there's no worry about other names getting in the way (i.e. it would never be something like :3:evnt:blahblahMISTER GREEN blah blah MISS SCARLET without another x:evnt in between).

I originally came up with:

(\d):evnt.*MISS SCARLET

which gives "0" because it matches back to the index for Mister Green, since .* matches everything. Ok, so ? to the rescue:

(\d):evnt.*?MISS SCARLET

Nope, same result.
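
A quick way to see why the lazy version gives the same result: the regex engine always tries the leftmost starting position first, and the "0" before MISTER GREEN is the first digit followed by ":evnt". The ? only changes how far .* stretches to the right, not where the match begins. A minimal Python sketch, using the sample string from above (the variable names are just for illustration):

```python
import re

sample = ("the beginning stuff:0:evnt:some random stuff:MISTER GREEN:"
          "more stuff:1:evnt:again random stuff:MISS SCARLET:the rest of the stuff")

# Greedy and lazy versions of the pattern from the question.
greedy = re.search(r"(\d):evnt.*MISS SCARLET", sample)
lazy = re.search(r"(\d):evnt.*?MISS SCARLET", sample)

# Both matches begin at the same (leftmost) position, so both capture "0".
print(greedy.group(1), lazy.group(1))
print(greedy.start() == lazy.start())
```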

How do I get the CLOSEST match to the text in question? I'm sure I'm not wording that correctly, but the example should show what I want. I want to isolate that "1" when looking for MISS SCARLET, and "0" if I were looking for MISTER GREEN.

Thanks.

CodePudding user response:

You may use this regex with a negated character class:

(\d+):evnt:[^:]*:MISS SCARLET

Here [^:]* matches zero or more characters that are not a colon, so the match cannot run past the single field that follows :evnt:.
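
As a rough check, the negated character class approach could be tried in Python like this. The helper name event_digit is hypothetical, the capture group is assumed to be (\d+), and the sketch assumes exactly one colon-free field sits between :evnt: and the name, as in the sample string:

```python
import re

sample = ("the beginning stuff:0:evnt:some random stuff:MISTER GREEN:"
          "more stuff:1:evnt:again random stuff:MISS SCARLET:the rest of the stuff")

def event_digit(text, name):
    """Return the digits before ':evnt:' in the event that names `name`, or None."""
    # [^:]* cannot cross a colon, so only one field may separate :evnt: and the name.
    pattern = rf"(\d+):evnt:[^:]*:{re.escape(name)}"
    match = re.search(pattern, text)
    return match.group(1) if match else None

print(event_digit(sample, "MISS SCARLET"))
print(event_digit(sample, "MISTER GREEN"))
```

Because the name is passed through re.escape, any regex metacharacters in it are treated literally.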

However, if a colon can appear in the text after the evnt keyword, you can use this regex with a negative lookahead instead:

(\d+):evnt:(?:(?!:evnt:).)*?:MISTER GREEN

Here (?:(?!:evnt:).)*? matches zero or more characters one at a time, checking before each one that :evnt: does not start at that position, so the match can never cross into another event.
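
The lookahead version can be sketched the same way in Python. The helper name event_digit_tempered and the test string with extra colons are made up for illustration, and the capture group is assumed to be (\d+):

```python
import re

# Hypothetical input where the "random stuff" itself contains colons.
text = "a:0:evnt:x:y:MISTER GREEN:b:1:evnt:p:q:MISS SCARLET:c"

def event_digit_tempered(text, name):
    """Return the digits of the nearest preceding ':evnt:' for `name`, or None."""
    # (?:(?!:evnt:).)*? consumes characters lazily but stops before any ':evnt:'.
    pattern = rf"(\d+):evnt:(?:(?!:evnt:).)*?:{re.escape(name)}"
    match = re.search(pattern, text)
    return match.group(1) if match else None

print(event_digit_tempered(text, "MISS SCARLET"))
print(event_digit_tempered(text, "MISTER GREEN"))

# For comparison, the simpler [^:]* pattern finds nothing on this input.
print(re.search(r"(\d+):evnt:[^:]*:MISS SCARLET", text))
```

The trade-off is speed: the lookahead runs before every consumed character, so the [^:]* pattern is the cheaper choice whenever the field between :evnt: and the name is known to contain no colons.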
