Regex - How to group multiple lines until line starts with a string?

Time:08-18

I have a text file like the following which I am trying to create some regex for in Python:

CR INFO
CR INFO
Wed Aug 17

foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out

CR INFO
CR INFO
Wed Aug 17

foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out

Now I'm fairly new to regex so apologies if this is very simple.

I'm trying to capture the lines starting with foo-bar and group them together. For example, the first 3 foo-bar lines should go in one group, and the 3 below the next date should go into another.

So far I have the following regex, (^foo-bar\s+[A-z0-9-]+), but it matches every foo-bar line as an individual group rather than putting 3 lines in one group. The regex flags on regex101.com are gm.

How can I group the 3 lines together until it meets either the "CR" string, or a double new line?

Many thanks.

CodePudding user response:

You can use

^foo-bar\s+[A-Za-z0-9-].*(?:\n.+)*

Or, to make sure each subsequent line starts with foo-bar and whitespace:

^foo-bar\s+[A-Za-z0-9-].*(?:\nfoo-bar\s.*)*

Use either pattern with re.M / re.MULTILINE to make sure ^ matches the start of any line.
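As a minimal sketch of how the second pattern could be applied in Python, the snippet below runs it over the sample text from the question with re.findall. The variable names (text, pattern, blocks) are illustrative assumptions, not part of the original answer.

```python
import re

# Sample input copied from the question.
text = """CR INFO
CR INFO
Wed Aug 17

foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out

CR INFO
CR INFO
Wed Aug 17

foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out
"""

# re.MULTILINE makes ^ match at the start of every line, not just the string.
pattern = re.compile(r"^foo-bar\s+[A-Za-z0-9-].*(?:\nfoo-bar\s.*)*", re.MULTILINE)

# The pattern has no capturing groups, so findall returns the full matches.
blocks = pattern.findall(text)

for i, block in enumerate(blocks, 1):
    print(f"--- group {i} ---")
    print(block)
```

Each element of blocks is one multi-line string holding a run of consecutive foo-bar lines. The first pattern gives the same result on this input because a blank line separates each run from the next "CR INFO" line; if a "CR" line followed a foo-bar line directly, only the second pattern would stop there.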

Details:

  • ^ - start of a line
  • foo-bar - a literal string
  • \s+ - one or more whitespaces
  • [A-Za-z0-9-] - an alphanumeric or hyphen
  • .* - the rest of the line
  • (?:\n.+)* - zero or more non-empty lines
  • (?:\nfoo-bar\s.*)* - zero or more non-empty lines that start with foo-bar and whitespace.
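The breakdown above can also be written into the pattern itself with re.VERBOSE, which ignores whitespace in the pattern and allows comments. This is only a sketch: the name GROUP_RE and the shortened sample string are assumptions made for illustration, and the sample deliberately puts a "CR INFO" line directly after a foo-bar line to show where a group ends.

```python
import re

GROUP_RE = re.compile(
    r"""
    ^foo-bar          # start of a line, then the literal string
    \s+               # one or more whitespace characters
    [A-Za-z0-9-]      # one alphanumeric character or hyphen
    .*                # the rest of the line
    (?:               # then, zero or more times:
        \n foo-bar \s .*   # a following line starting with foo-bar and whitespace
    )*
    """,
    re.MULTILINE | re.VERBOSE,
)

# Hypothetical input: no blank line between the second foo-bar line and "CR INFO".
sample = (
    "Wed Aug 17\n"
    "\n"
    "foo-bar name_10_Name-Child_test\n"
    "foo-bar name_25_Name-out\n"
    "CR INFO\n"
    "foo-bar name_1000_Name-test_out\n"
)

# Split each matched block into its individual lines.
groups = [m.group(0).splitlines() for m in GROUP_RE.finditer(sample)]
print(groups)
```

Here the first group holds two lines and the second holds one, because the "CR INFO" line does not start with foo-bar and therefore ends the first run.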

Note that [A-z] matches more than just letters: the range runs from A to z in ASCII order, so it also covers the characters [, \, ], ^, _ and `.
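A quick way to check this is to list every character between A and z that [A-z] accepts but that is not a letter. The variable name extra is an illustrative assumption.

```python
import re

# Walk the ASCII range A..z and keep the non-letters that [A-z] still matches.
extra = [
    chr(code)
    for code in range(ord("A"), ord("z") + 1)
    if re.fullmatch(r"[A-z]", chr(code)) and not chr(code).isalpha()
]
print(extra)  # ['[', '\\', ']', '^', '_', '`']
```

This is why the [A-z0-9-] class in the question happened to match the underscores in the sample names, while [A-Za-z0-9-] on its own would not.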
