Regex matching pattern in multiple lines without specific word in the match

Time:10-14

I would like to match the following pattern across multiple lines:

  1. The pattern begins with "PAT_A"
  2. The pattern ends with the first ";" after "PAT_A"
  3. The pattern contains "PAT_B" between "PAT_A" and ";"
  4. The pattern does not contain "NOT_MATCH_THIS" between "PAT_A" and ";"

For example, this should match:

PAT_A_YYY(
  OK,
  PAT_B
);

And this should not match:

PAT_A_XXX(
  NOT_MATCH_THIS,
  PAT_B
);

I managed to fulfill the first three requirements with

(PAT_A[^;]*?)(\bPAT_B\b)([^;]*;)

where the groups are for extracting the different parts matched.

However, I did not succeed in excluding matches containing "NOT_MATCH_THIS".
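
To make the problem concrete, here is a minimal sketch that runs the pattern above against both sample blocks. It assumes Python's re module (the question does not name a regex flavor), and the variable names are made up for illustration.

```python
import re

# The pattern from the question: it covers requirements 1-3 only.
pattern = re.compile(r"(PAT_A[^;]*?)(\bPAT_B\b)([^;]*;)")

should_match = "PAT_A_YYY(\n  OK,\n  PAT_B\n);"
should_not_match = "PAT_A_XXX(\n  NOT_MATCH_THIS,\n  PAT_B\n);"

# [^;]*? accepts any character except ';', so it also consumes NOT_MATCH_THIS.
wanted = pattern.search(should_match)
unwanted = pattern.search(should_not_match)

print(wanted is not None)    # True
print(unwanted is not None)  # True, although this block should be rejected
```

Both blocks match, which is exactly the fourth requirement that is still missing.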

I have checked the post "How to negate specific word in regex?" about negative lookahead. However, the answer there seems to match a whole line rather than the pattern described above, and I am not sure how to incorporate a negative lookahead into my regex.

Is there a regex that fulfills all four requirements?

CodePudding user response:

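I don't have a regex interpreter handy, but you could try putting the negative lookahead directly after PAT_A, so that it scans everything up to the first ";":

(PAT_A(?![^;]*NOT_MATCH_THIS)[^;]*?)(\bPAT_B\b)([^;]*;)

Or maybe check before every character that is consumed:

(PAT_A(?:(?!NOT_MATCH_THIS)[^;])*?)(\bPAT_B\b)((?:(?!NOT_MATCH_THIS)[^;])*;)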
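
A lookahead only inspects what follows the position where it is written, so it matters where it goes. The sketch below places it right after PAT_A and lets it scan up to the first ";". It assumes Python's re module, and the sample text is simply the two blocks from the question joined together.

```python
import re

# Negative lookahead right after PAT_A: reject the candidate if
# NOT_MATCH_THIS occurs anywhere before the first ';'.
pattern = re.compile(
    r"(PAT_A(?![^;]*NOT_MATCH_THIS)[^;]*?)(\bPAT_B\b)([^;]*;)"
)

text = (
    "PAT_A_XXX(\n  NOT_MATCH_THIS,\n  PAT_B\n);\n"
    "PAT_A_YYY(\n  OK,\n  PAT_B\n);\n"
)

# Collect the full text of every match; only the YYY block should appear.
matches = [m.group(0) for m in pattern.finditer(text)]
print(matches)
```

Because [^;] also matches newlines, no extra flag is needed for the match to span several lines, and the three capture groups from the question are kept.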

CodePudding user response:

You might use

^PAT_A[^;\n]*(?:\n(?![^\n;]*NOT_MATCH_THIS)[^;\n]*)*\n[^;\n]*PAT_B[^;]*;

In parts, the pattern matches:

  • ^ Start of the string (or of a line, if the multiline flag is enabled)
  • PAT_A Match literally
  • [^;\n]* Match zero or more characters other than ; or a newline
  • (?: Non-capturing group (to repeat as a whole)
    • \n(?![^\n;]*NOT_MATCH_THIS) Match a newline, and assert that the line that follows does not contain NOT_MATCH_THIS (the lookahead stops at a ; or a newline, so it stays on that line)
    • [^;\n]* If the assertion succeeds, match the rest of the line (not containing a ;)
  • )* Close the non-capturing group and repeat it zero or more times to match all such lines
  • \n[^;\n]* Match a newline, then zero or more characters other than ; or a newline
  • PAT_B[^;]*; Match PAT_B, then zero or more characters other than ;, then the ; itself
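
The breakdown above can be tried out with a short sketch. It assumes Python's re module and uses re.MULTILINE so that ^ anchors at the start of each line rather than only at the start of the string. The sample text is the two blocks from the question.

```python
import re

# re.MULTILINE lets ^ match at the start of every line.
pattern = re.compile(
    r"^PAT_A[^;\n]*(?:\n(?![^\n;]*NOT_MATCH_THIS)[^;\n]*)*\n[^;\n]*PAT_B[^;]*;",
    re.MULTILINE,
)

text = (
    "PAT_A_XXX(\n  NOT_MATCH_THIS,\n  PAT_B\n);\n"
    "PAT_A_YYY(\n  OK,\n  PAT_B\n);\n"
)

# Only the block without NOT_MATCH_THIS on its inner lines is found.
found = [m.group(0) for m in pattern.finditer(text)]
print(found)

# Caveat: the lookahead runs only after a newline, so the PAT_A line
# itself is never checked and this input still matches.
same_line = "PAT_A_XXX(NOT_MATCH_THIS,\n  PAT_B\n);"
print(pattern.search(same_line) is not None)
```

So this pattern fits input laid out like the examples, with one item per line. If NOT_MATCH_THIS can appear on the same line as PAT_A, a lookahead that scans from PAT_A up to the first ; is probably the safer choice.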

