I'm looking for a single regex, written for Node.js, that captures only the text on lines that start with PASS! or FAIL! and appear between two specific words. Example:
INFO! this line shouldn't be captured because it's before section121
[section120] section title1
Some noise
PASS! this line shouldn't captured either because it's before section121
[section121] section title2
more noise
FAIL! match1
a warning we wish to skip
more warnings
PASS! match2
FAIL! match3
[section122] section title3
noise
PASS! this shouldn't be captured because it appears after section122
The expected captures for this input are:
match1
match2
match3
Can this be achieved using a single regex? If not, an explanation why would also be accepted as an answer.
I tried writing several different regexes, but always ended up capturing only the last line (match3):
section121\][\s\S]*(?:PASS!|FAIL!)([\s\S]*)\[section122
CodePudding user response:
With JavaScript's support for lookbehind assertions, you can use:
(?<=^\[section121].*(?:\n(?!\[section\d+]).*)*\n(?:PASS|FAIL)!).*
Explanation
(?<= : Positive lookbehind
^ : Start of a line (the pattern is used with the m flag)
\[section121].* : Match [section121] and the rest of the line
(?:\n(?!\[section\d+]).*)* : Match a newline, and repeat matching all lines that do not start with [section, 1 or more digits and ]
\n(?:PASS|FAIL)! : Match a newline and either PASS! or FAIL!
) : Close the lookbehind
.* : Match the rest of the line (optionally match any character except newlines)
See a regex101 demo
const regex = /(?<=^\[section121].*(?:\n(?!\[section\d+]).*)*\n(?:PASS|FAIL)!).*/gm;
const s = `INFO! this line shouldn't be captured because it's before section121
[section120] section title1
Some noise
PASS! this line shouldn't captured either because it's before section121
[section121] section title2
more noise
FAIL! match1
a warning we wish to skip
more warnings
PASS! match2
FAIL! match3
[section122] section title3
noise
PASS! this shouldn't be captured because it appears after section122`;
console.log(s.match(regex));
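Each match from the lookbehind pattern starts with the space that follows PASS! or FAIL!, because `.*` begins right after the `!`. Below is a minimal sketch, assuming that leading whitespace is unwanted and that the section name may vary. The names `escapeRegex` and `capturesInSection` are hypothetical helpers, not part of the answer above, and the pattern still assumes every section header has the form `[section<digits>]`.

```javascript
// Hypothetical helper: escape regex metacharacters so the
// section name is matched literally inside the pattern.
const escapeRegex = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Hypothetical helper: build the lookbehind pattern for one section
// and return the text after PASS!/FAIL! on each line, trimmed.
function capturesInSection(input, sectionName) {
  const name = escapeRegex(sectionName);
  const regex = new RegExp(
    String.raw`(?<=^\[${name}].*(?:\n(?!\[section\d+]).*)*\n(?:PASS|FAIL)!).*`,
    "gm"
  );
  // match() returns null when nothing matches, so fall back to [].
  return (input.match(regex) ?? []).map((m) => m.trim());
}

const demoInput = [
  "PASS! before the section",
  "[section121] section title2",
  "more noise",
  "FAIL! match1",
  "a warning we wish to skip",
  "PASS! match2",
  "FAIL! match3",
  "[section122] section title3",
  "PASS! after the section",
].join("\n");

console.log(capturesInSection(demoInput, "section121"));
// [ 'match1', 'match2', 'match3' ]
```

Trimming after the match keeps the lookbehind itself unchanged. Matching the optional space inside the lookbehind would also work, but it makes the pattern harder to read.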
An alternative without lookbehind support, in 2 steps:
const regex = /\[section121].*(?:\n(?!\[section\d+]|(?:PASS|FAIL)!).*)*\n(?:PASS|FAIL)!.*(?:\n(?!\[section\d+]).*)*/;
const s = `INFO! this line shouldn't be captured because it's before section121
[section120] section title1
Some noise
PASS! this line shouldn't captured either because it's before section121
[section121] section title2
more noise
FAIL! match1
a warning we wish to skip
more warnings
PASS! match2
FAIL! match3
[section122] section title3
noise
PASS! this shouldn't be captured because it appears after section122`;
const res = s.match(regex);
if (res) {
  console.log(Array.from(res[0].matchAll(/^(?:PASS|FAIL)!(.*)/mg), m => m[1]));
}
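As for why the attempt in the question returns only match3: a capture group holds one value per match, and the first greedy `[\s\S]*` consumes as much as it can, so the pattern settles on the last PASS!/FAIL! before `[section122`. A minimal sketch reproducing this on a shortened input (the variable names are illustrative):

```javascript
// The pattern from the question, unchanged.
const attempt = /section121\][\s\S]*(?:PASS!|FAIL!)([\s\S]*)\[section122/;

const sample = [
  "[section121] section title2",
  "FAIL! match1",
  "PASS! match2",
  "FAIL! match3",
  "[section122] section title3",
].join("\n");

// The greedy [\s\S]* backtracks only as far as the LAST FAIL!/PASS!
// that is still followed by "[section122", so group 1 holds just
// the tail of that final line.
const onlyCapture = sample.match(attempt)[1];
console.log(JSON.stringify(onlyCapture)); // " match3\n"
```

Making the first quantifier lazy would not help either: group 1 would then run from the first FAIL! up to `[section122` as one string, still a single capture rather than one per line.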