I have a very large text file with several entries like this:
-------------------------------------
LOTS OF
MULTILINE
TEXT
*************************************
MORE
MULTILINE
TEXT
*************************************
EVEN MORE
*************************************
-------------------------------------
2ND LOT OF
MULTILINE
TEXT
*************************************
MORE
MULTILINE
TEXT FOR 2ND LOT
*************************************
EVEN MORE TEXT FOR 2ND
*************************************
Note that these are only two entries. I don't care about the asterisks, only about the text that follows each dashed line.
I want to get a capture group with all the text in each entry so that I can analyze it later line by line.
I can capture the first entry with an expression like this:
/-{37}\s*([\s\S]+)-{37}/gm
But I'm having trouble matching the capture group repeatedly, because I don't have a clear terminator for each group (the \*{37} line appears several times within an entry).
Here's a regex 101 example:
https://regex101.com/r/XZQ5h6/1
How can I capture the text after the dashed line but before the next dashed line or the end of the file?
CodePudding user response:
You can use this regex:
-{37}\R+((?:.+\R)+)
RegEx Detail:
-{37}: Match exactly 37 hyphens
\R+: Match 1+ line breaks
(: Start capture group
(?:.+\R)+: Match a line of 1+ characters followed by a line break. Repeat this group 1+ times to match multiple such lines
): End capture group
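For readers working in an engine without \R (Python's re module does not support it, for example), here is a minimal sketch of a related approach. It is not the pattern above. It swaps in a lazy match that stops at a lookahead for the next dashed line or the end of the input, which gives each entry an explicit terminator. The sample text, the ENTRY name, and the split_entries helper are made up for illustration.

```python
import re

# Shortened sample in the same shape as the question's file (illustrative data).
SAMPLE = """\
-------------------------------------
LOTS OF
MULTILINE
TEXT
*************************************
MORE
*************************************
-------------------------------------
2ND LOT OF
TEXT
*************************************
EVEN MORE TEXT FOR 2ND
*************************************
"""

# ^-{37}\r?\n   a line of exactly 37 hyphens, then its line break
# (.*?)         lazily capture everything (DOTALL lets . cross line breaks)
# (?=...)       stop before the next 37-hyphen line, or at the end of the input
ENTRY = re.compile(
    r"^-{37}\r?\n(.*?)(?=^-{37}\r?$|\Z)",
    re.MULTILINE | re.DOTALL,
)


def split_entries(text):
    """Return the body of each entry, without its leading dashed line."""
    return [match.group(1) for match in ENTRY.finditer(text)]


entries = split_entries(SAMPLE)
for number, entry in enumerate(entries, start=1):
    # Analyze each entry line by line, skipping the asterisk separators.
    lines = [line for line in entry.splitlines() if line != "*" * 37]
    print(number, lines)
```

Because the terminator is a lookahead, the next dashed line is not consumed and is still available as the start of the following match.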
CodePudding user response:
This regex will match both entries:
/-{37}[^-]+/gm
Try it out in regex101.
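To see how this pattern behaves outside regex101, here is a small Python sketch run on made-up sample data. Two things are worth noting: each match includes the dashed line itself, since there is no capture group, and the negated class [^-] stops at the first hyphen, so an entry whose text contains a hyphen would be cut short.

```python
import re

DASHES = "-" * 37
STARS = "*" * 37

# Two short entries in the same shape as the question's file (illustrative data).
text = "\n".join([
    DASHES, "LOTS OF", "TEXT", STARS,
    DASHES, "2ND LOT OF", "TEXT", STARS,
]) + "\n"

PATTERN = re.compile(r"-{37}[^-]+")

# One match per entry; each match starts with the dashed line.
matches = PATTERN.findall(text)
print(len(matches))

# Caveat: a hyphen inside the entry text ends the match early.
hyphenated = "\n".join([DASHES, "MULTI-LINE TEXT", STARS]) + "\n"
truncated = PATTERN.findall(hyphenated)
print(truncated)
```

If the real entries can contain hyphens, a pattern with an explicit terminator for the next dashed line is likely the safer choice.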