Regex multiline matching in Python

Time:12-11

I want to match 'here is a sample' and all the lines after it, up to the next two consecutive newlines.

Here is my file (you can use it as a logfile):

here is a sample text
random line1


here is a sample text
random line2
random line3
random line4


should not match
random line 6


here is a sample 
random line 5

I tried:

    \r?\n?(here is a sample).*\r?\n?(.*)

With that I only match the next line. If I repeat the last part, '\r?\n?(.*)', I get one more line.
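The behaviour described here can be reproduced with a short script. This is only a sketch: the `LOG` string is the sample file from the question typed in by hand (assuming `\n` line endings), and the names `LOG`, `ATTEMPT` and `pairs` are made up for the illustration. Because `.` does not match a newline by default, each `(.*)` can capture at most one line.

```python
import re

# Sample file from the question, assuming "\n" line endings.
LOG = (
    "here is a sample text\n"
    "random line1\n"
    "\n"
    "\n"
    "here is a sample text\n"
    "random line2\n"
    "random line3\n"
    "random line4\n"
    "\n"
    "\n"
    "should not match\n"
    "random line 6\n"
    "\n"
    "\n"
    "here is a sample \n"
    "random line 5"
)

# The pattern from the question. "." stops at a newline, so the second
# group never extends past the single line after the marker.
ATTEMPT = re.compile(r"\r?\n?(here is a sample).*\r?\n?(.*)")

# findall returns one (group 1, group 2) tuple per match.
pairs = ATTEMPT.findall(LOG)
for marker, following in pairs:
    print(repr(marker), repr(following))
```

For the second block only `random line2` is captured; `random line3` and `random line4` are left out.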

My question: what regular expression do I need in order to match all lines until I see two newlines in a row?

CodePudding user response:

If you want to match everything up to two consecutive newlines, but also want to match the last occurrence when no two newlines follow it:

    ^here is a sample.*(?:\n(?!\n).*)*

The pattern matches:

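  • ^ Start of a line (this needs the multiline flag, re.MULTILINE in Python; without it ^ only matches at the start of the string)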
  • here is a sample.* Match literally and the rest of the line
  • (?: Non capture group to repeat as a whole part
    • \n(?!\n) Match a newline, and assert that it is not directly followed by a newline
    • .* Match the rest of the line
  • )* Close the non capture group and optionally repeat it
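A minimal Python sketch of this pattern, run against the sample file from the question. The `LOG` string is typed in by hand and assumes `\n` line endings; `LOG`, `BLOCK` and `blocks` are names chosen for the illustration. The `re.MULTILINE` flag is what lets `^` match at the start of every line.

```python
import re

# Sample file from the question, assuming "\n" line endings.
LOG = (
    "here is a sample text\n"
    "random line1\n"
    "\n"
    "\n"
    "here is a sample text\n"
    "random line2\n"
    "random line3\n"
    "random line4\n"
    "\n"
    "\n"
    "should not match\n"
    "random line 6\n"
    "\n"
    "\n"
    "here is a sample \n"
    "random line 5"
)

# re.MULTILINE makes "^" match at the start of each line.
# The repeated group consumes a newline only when the next character
# is not another newline, so a match stops in front of a blank line.
BLOCK = re.compile(r"^here is a sample.*(?:\n(?!\n).*)*", re.MULTILINE)

# The pattern has no capture groups, so findall returns the full matches.
blocks = BLOCK.findall(LOG)
for block in blocks:
    print(repr(block))
```

With this input all three blocks that start with the marker are returned, including the last one, which is not followed by blank lines. The block starting with `should not match` is skipped.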


If the two newlines must be present, use a capture group for the part that you want to keep, and match the two newlines after it to make sure that they are there.

    ^(here is a sample.*(?:\n(?!\n).*)*)\n\n
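The stricter variant can be tried the same way. Again a sketch under the same assumptions: `LOG` is the sample file typed in by hand with `\n` line endings, and `STRICT` and `blocks` are illustrative names. Since the pattern has exactly one capture group, `findall` returns only the captured block, without the two trailing newlines.

```python
import re

# Sample file from the question, assuming "\n" line endings.
LOG = (
    "here is a sample text\n"
    "random line1\n"
    "\n"
    "\n"
    "here is a sample text\n"
    "random line2\n"
    "random line3\n"
    "random line4\n"
    "\n"
    "\n"
    "should not match\n"
    "random line 6\n"
    "\n"
    "\n"
    "here is a sample \n"
    "random line 5"
)

# Group 1 holds the block; the trailing "\n\n" must be present for a
# match but is not part of the captured text.
STRICT = re.compile(r"^(here is a sample.*(?:\n(?!\n).*)*)\n\n", re.MULTILINE)

# With a single capture group, findall returns a list of group 1 values.
blocks = STRICT.findall(LOG)
for block in blocks:
    print(repr(block))
```

Here the last block (`here is a sample` followed by `random line 5`) is no longer returned, because the input ends without two newlines after it.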
