I am trying to create a regular expression which matches multiple groups, so the values between the groups can be extracted. Each group looks identical.
Let's consider the following example (the line breaks are intentional):
dog 1
wuff
wuff
cat
123
XYZ
dog 1
wuff
wuff
cat
456
ABC
dog 1
wuff
wuff
cat
789
Thus, with the right regular expression I want to get the output:
123
XYZ
456
ABC
789
On regex101.com I tried:
(?s)(?:dog.*cat)
which matches everything between the first occurrence of dog and the last occurrence of cat.
In addition I tried:
(?s)(?:dog.*(cat){1})
which, with my limited knowledge, should match the first occurrence of cat and then end the group, but it does not.
I appreciate any help.
CodePudding user response:
You may use this regex in MULTILINE mode to capture the values after each dog.*cat match:
^dog\b(?:.*\n)+?cat\n(.*(?:\n.*)*?)(?=\ndog|\Z)
Your values are present in capture group #1.
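For illustration, here is a minimal Python sketch of reading capture group #1. It assumes the sample input from the question is held in a string without a trailing newline; the names text, pattern and values are made up for this example.

```python
import re

# Sample input from the question (assumed to have no trailing newline).
text = (
    "dog 1\nwuff\nwuff\ncat\n123\nXYZ\n"
    "dog 1\nwuff\nwuff\ncat\n456\nABC\n"
    "dog 1\nwuff\nwuff\ncat\n789"
)

# re.MULTILINE makes ^ match at the start of every line, not just the string.
pattern = re.compile(
    r"^dog\b(?:.*\n)+?cat\n(.*(?:\n.*)*?)(?=\ndog|\Z)",
    re.MULTILINE,
)

# With exactly one capture group, findall returns the group-1 text per match.
values = pattern.findall(text)

for block in values:
    print(block)
```

Each element of values holds the lines between one cat and the next dog (or the end of input), so a two-line value such as 123 / XYZ comes back as a single string containing a line break.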
RegEx Details:
^ : Match the start of a line
dog\b : Match the word dog followed by a word boundary
(?:.*\n)+? : Match anything followed by a line break. Repeat this 1+ times (lazy)
cat\n : Match cat followed by a newline
(.*(?:\n.*)*?) : Capture the multiline values you're interested in (first capture group)
(?=\ndog|\Z) : Lookahead to assert that we have a dog after a line break, or the end of input, ahead of the current position
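As a side note on the attempts in the question: .* is greedy, so (?s)dog.*cat runs all the way to the last cat, and (cat){1} is simply cat matched once, which does not change that. A small Python sketch of the greedy versus lazy difference follows; the shortened sample string and the variable names are assumptions made for illustration.

```python
import re

# Shortened sample with two dog...cat blocks (illustrative only).
text = "dog 1\nwuff\ncat\n123\ndog 1\nwuff\ncat\n456"

# Greedy: .* consumes as much as possible, so a single match spans
# from the first dog to the last cat.
greedy = re.findall(r"(?s)dog.*cat", text)

# Lazy: .*? stops at the nearest cat, so each block matches on its own.
lazy = re.findall(r"(?s)dog.*?cat", text)

print(len(greedy))
print(len(lazy))
```

The lazy quantifier alone only delimits the dog...cat headers; the capture group and lookahead in the regex above are what pick up the values that follow each header.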