I would like to match the following pattern across multiple lines:
- The pattern begins with "PAT_A"
- The pattern ends with the first ";" after "PAT_A"
- The pattern contains "PAT_B" between "PAT_A" and ";"
- The pattern does not contain "NOT_MATCH_THIS" between "PAT_A" and ";"
For example, this should make a match
PAT_A_YYY(
OK,
PAT_B
);
And this should not make a match.
PAT_A_XXX(
NOT_MATCH_THIS,
PAT_B
);
I managed to fulfill the first three requirements with
(PAT_A[^;]*?)(\bPAT_B\b)([^;]*;)
where the groups are for extracting the different parts matched.
However, I did not succeed in excluding matches containing "NOT_MATCH_THIS".
I have checked the post "How to negate specific word in regex?" about negative lookahead. However, it seems that the answer there matches the whole line instead of the pattern requirement described above. And I am not sure how I should incorporate the negative lookahead into my regex pattern.
Is there a regex that fulfills all four requirements?
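For reference, here is a minimal reproduction of the problem, assuming Python's re module (the question does not name a regex flavor, so the language is an assumption). It shows the pattern above matching both sample blocks, including the one that should be rejected:

```python
import re

# Pattern from the question: covers the first three requirements only.
pattern = re.compile(r"(PAT_A[^;]*?)(\bPAT_B\b)([^;]*;)")

should_match = "PAT_A_YYY(\nOK,\nPAT_B\n);"
should_not_match = "PAT_A_XXX(\nNOT_MATCH_THIS,\nPAT_B\n);"

m = pattern.search(should_match)
print(m.groups())  # the three extracted parts

# The unwanted block is matched as well, which is the problem being asked about.
unwanted = pattern.search(should_not_match)
print(unwanted is not None)
```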
CodePudding user response:
I don't have a RegEx interpreter handy, but you could try this:
(PAT_A[^;]*?(?!NOT_MATCH_THIS))(\bPAT_B\b)([^;]*;)
Or maybe:
(PAT_A[^;]*?(?!NOT_MATCH_THIS)[^;]*?)(\bPAT_B\b)([^;]*;)
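Neither pattern above was tested. A quick check, assuming Python's re module, suggests that a lookahead placed once only guards a single position, so both patterns still accept the unwanted block. One possible variant (my own sketch, not one of the patterns above) repeats the lookahead before every consumed character:

```python
import re

text_ok = "PAT_A_YYY(\nOK,\nPAT_B\n);"
text_bad = "PAT_A_XXX(\nNOT_MATCH_THIS,\nPAT_B\n);"

# The two suggestions above: the lookahead is checked at one position only.
single_1 = re.compile(r"(PAT_A[^;]*?(?!NOT_MATCH_THIS))(\bPAT_B\b)([^;]*;)")
single_2 = re.compile(r"(PAT_A[^;]*?(?!NOT_MATCH_THIS)[^;]*?)(\bPAT_B\b)([^;]*;)")

# Sketch of a variant: check the lookahead before each character,
# both before and after PAT_B, so NOT_MATCH_THIS cannot be skipped over.
tempered = re.compile(
    r"(PAT_A(?:(?!NOT_MATCH_THIS)[^;])*?)"
    r"(\bPAT_B\b)"
    r"((?:(?!NOT_MATCH_THIS)[^;])*;)"
)

for name, rx in [("single_1", single_1), ("single_2", single_2), ("tempered", tempered)]:
    print(name, bool(rx.search(text_ok)), bool(rx.search(text_bad)))
```

The three capture groups are kept, so the parts can still be extracted as in the question.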
CodePudding user response:
You might use
^PAT_A[^;\n]*(?:\n(?![^\n;]*NOT_MATCH_THIS)[^;\n]*)*\n[^;\n]*PAT_B[^;]*;
In parts, the pattern matches:
- "^": Start of the string (or of a line, when the multiline flag is set)
- "PAT_A": Match literally
- "[^;\n]*": Optionally match any character except ";" or a newline
- "(?:": Non-capturing group (to repeat as a whole)
  - "\n(?![^\n;]*NOT_MATCH_THIS)": Match a newline, and assert that the line that follows does not contain "NOT_MATCH_THIS"; the lookahead excludes ";" and newlines so that it stays on that line
  - "[^;\n]*": If the assertion succeeds, match the whole line (not containing a ";")
- ")*": Close the non-capturing group, and optionally repeat it to match all such lines
- "\n[^;\n]*": Match a newline, and any character except ";" or a newline
- "PAT_B[^;]*;": Then match "PAT_B", followed by any characters except ";", followed by the ";" itself
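A small check of this pattern against the two sample blocks from the question, assuming Python's re module. The re.MULTILINE flag is an assumption on my part, so that "^" can anchor at the start of any line rather than only at the start of the string:

```python
import re

pattern = re.compile(
    r"^PAT_A[^;\n]*(?:\n(?![^\n;]*NOT_MATCH_THIS)[^;\n]*)*\n[^;\n]*PAT_B[^;]*;",
    re.MULTILINE,  # assumption: lets ^ match at each line start
)

# Both sample blocks from the question, one after the other.
text = (
    "PAT_A_YYY(\nOK,\nPAT_B\n);\n"
    "PAT_A_XXX(\nNOT_MATCH_THIS,\nPAT_B\n);\n"
)

# No capture groups in the pattern, so findall returns the full matches.
matches = pattern.findall(text)
print(matches)

# Edge case: NOT_MATCH_THIS on the same line as PAT_B is not rejected.
same_line = "PAT_A_ZZZ(\nNOT_MATCH_THIS, PAT_B\n);"
print(pattern.search(same_line) is not None)
```

Note that the lookahead only guards the lines consumed by the repeated group. The line holding PAT_B is matched by the final part of the pattern, so a NOT_MATCH_THIS on that same line is not excluded.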