Regex match words, but not when the entire string is matched

Time:08-08

I have this regex now:

((\b(Example|Overflow|Test)\b)\s*(?!.*\2))+

This matches consecutive predefined words and stops before a word that occurs again later in the string (if any). Several example inputs:

  1. Example Overflow Test Overflow
  2. Example Overflow Test
  3. Just an Example Overflow
  4. Example another Overflow
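
To make the current behaviour concrete, here is a minimal sketch that runs the regex over the four inputs. Python is an assumption on my part (its `re` module accepts this lookahead and backreference syntax), the pattern is assumed to end with a `+` quantifier so that consecutive words form one match, and `find_runs` is a hypothetical helper name.

```python
import re

# Assumed form of the regex above: the trailing + joins consecutive words into one match.
PATTERN = re.compile(r"((\b(Example|Overflow|Test)\b)\s*(?!.*\2))+")

def find_runs(text):
    """Return every match in text, with surrounding whitespace stripped for readability."""
    return [m.group(0).strip() for m in PATTERN.finditer(text)]

examples = [
    "Example Overflow Test Overflow",  # ['Example', 'Test Overflow']
    "Example Overflow Test",           # ['Example Overflow Test']
    "Just an Example Overflow",        # ['Example Overflow']
    "Example another Overflow",        # ['Example', 'Overflow']
]

for line in examples:
    print(line, "->", find_runs(line))
```

In the second input the match covers the entire string, which is the case I want to treat differently.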

This does what I ask, but I want to special-case the 2nd example (Example Overflow Test). If the entire string is matched (from start to end), I don't want the first word to be part of the match anymore, so 2. should instead match only "Overflow Test". My actual regex has many more words, so I'd like to avoid repeating them and instead refer back to the same capture group, similar to how I already check whether a word has been used before.

Answer:

Since the restriction applies only to a match that begins at the start of the string (you still want to match a later part of the string if it contains matching words), you can use a negative lookahead like this:

(?!^(?:\b(Example|Overflow|Test)\b(?!.*\b\1\b)\s*)+$)(?:\b(Example|Overflow|Test)\b(?!.*\b\2\b)\s*)+


The (?!^(?:\b(Example|Overflow|Test)\b(?!.*\b\1\b)\s*)+$) negative lookahead fails the match when the current position is the start of the string and the listed words, matched from there, run all the way to the end of the string. Since it can only fire at the start of the string, the regex engine is still free to find a match that begins at the second word. Note the changed backreference numbers and the additional word boundaries around the backreferences, which avoid partial-word matches in the lookahead.
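
As a quick check of the behaviour described above, here is a small sketch that applies the pattern to the four inputs from the question. Python is used as an assumption (its `re` module supports the same lookaheads and backreferences), the pattern is written with `+` quantifiers after both groups, and `find_matches` is a hypothetical helper name.

```python
import re

# The lookahead at the front rejects a match that starts at position 0
# and would run through to the end of the string.
PATTERN = re.compile(
    r"(?!^(?:\b(Example|Overflow|Test)\b(?!.*\b\1\b)\s*)+$)"
    r"(?:\b(Example|Overflow|Test)\b(?!.*\b\2\b)\s*)+"
)

def find_matches(text):
    """Return every match in text, with surrounding whitespace stripped for readability."""
    return [m.group(0).strip() for m in PATTERN.finditer(text)]

examples = [
    "Example Overflow Test Overflow",  # ['Example', 'Test Overflow']
    "Example Overflow Test",           # ['Overflow Test'] (first word dropped)
    "Just an Example Overflow",        # ['Example Overflow']
    "Example another Overflow",        # ['Example', 'Overflow']
]

for line in examples:
    print(line, "->", find_matches(line))
```

Only the second input changes compared with the original regex: the match can no longer start at the first word, so the engine retries from the next word and finds "Overflow Test".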

In C#, you can define the repeating patterns with a variable and reuse it inside the string literal where you define the regex.
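
The same idea can be sketched in any language with string concatenation. Below is a minimal Python version (an assumption, since the answer names C#) that keeps the word list in one place and assembles the full pattern from it. `word_unit` and `build_pattern` are hypothetical helper names, not part of any library.

```python
import re

def word_unit(alternation, group_number):
    """One listed word that does not occur again later, plus optional trailing whitespace."""
    backref = "\\" + str(group_number)  # e.g. \1 or \2
    return r"\b(" + alternation + r")\b(?!.*\b" + backref + r"\b)\s*"

def build_pattern(words):
    """Assemble the full regex from a single word list."""
    alternation = "|".join(re.escape(w) for w in words)
    # The word unit appears twice, so its capture group is number 1 in the
    # lookahead and number 2 in the consuming part.
    guard = "(?!^(?:" + word_unit(alternation, 1) + ")+$)"
    body = "(?:" + word_unit(alternation, 2) + ")+"
    return re.compile(guard + body)

pattern = build_pattern(["Example", "Overflow", "Test"])
print(pattern.pattern)
print([m.group(0).strip() for m in pattern.finditer("Example Overflow Test")])
```

With this layout, adding a word means changing only the list passed to `build_pattern`; the backreference numbers are the one detail that has to differ between the two copies.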
