Matching everything except for a character followed by a newline

This seems like a simple match, but I'm unable to figure out how to match all text that starts with a known block of text and ends with a semicolon followed by a newline. What I have right now mostly works:

pattern = r'''[ ]+(value \w+\n)([^;]+)'''

This lets me parse a section of text such as:

   value Y1N5NALC
      1 = 'Yes'  
      5 = 'No'  
      7 = 'Not ascertained' ;
   value AGESCRN
      15 = '15 years'  
      16 = '16 years';

However, if any of the key/value pairs contains a semicolon inside the quoted string, the match ends early, since the regex stops at the first semicolon it finds. An example:

   value Y1N5NALC
      1 = 'Yes'  
      5 = 'No;Maybe'  
      7 = 'Not ascertained' ;

What I'd like to do is end the match at a semicolon followed by optional spaces or tabs and then a newline. Using ([^;\n]+) fails because the negated class then excludes newlines as well, so the match stops at the end of the first line.

CodePudding user response：

You can use

(?sm)^ +(value \w+\n)(.*?);$

Details:

(?sm) - re.S and re.M are on
^ - start of a line
 + - one or more spaces
(value \w+\n) - Group 1: value, a space, one or more word chars, and an LF line break
(.*?) - Group 2: any zero or more chars, including line breaks (because of re.S), as few as possible
; - a ;
$ - end of a line (because of re.M)
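
As a quick check, here is a minimal Python sketch that runs the pattern against the sample text from the question. The variable names (text, pattern, blocks) are only illustrative, and the sample is assumed to use LF line endings with no whitespace after the closing semicolon.

```python
import re

# Sample text from the question, with a semicolon inside a quoted label.
text = (
    "   value Y1N5NALC\n"
    "      1 = 'Yes'  \n"
    "      5 = 'No;Maybe'  \n"
    "      7 = 'Not ascertained' ;\n"
    "   value AGESCRN\n"
    "      15 = '15 years'  \n"
    "      16 = '16 years';\n"
)

# (?s) lets . cross line breaks, (?m) lets ^ and $ match at every line.
pattern = re.compile(r"(?sm)^ +(value \w+\n)(.*?);$")

# Collect (header, body) pairs, one per "value" block.
blocks = [(m.group(1).strip(), m.group(2)) for m in pattern.finditer(text)]

for header, body in blocks:
    print(header)
    print(body)
```

The semicolon in 'No;Maybe' is skipped because it is not at the end of a line, so the lazy group keeps expanding until it reaches a semicolon that is.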

In case there can be CRLF endings, you need

(?sm)^  (value \w \r?\n)(.*?);\r?$
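
The question also asks for optional spaces or tabs between the semicolon and the line break. One way to allow that, as an assumption layered on top of the pattern above rather than part of the original answer, is to add [ \t]* after the semicolon. The sketch below uses a hypothetical helper, parse_value_blocks, that maps each block name to its body.

```python
import re

# Variant of the CRLF-tolerant pattern that also accepts trailing
# spaces or tabs after the closing semicolon (an added assumption).
VALUE_BLOCK = re.compile(r"(?sm)^ +(value \w+\r?\n)(.*?);[ \t]*\r?$")


def parse_value_blocks(text):
    """Return {block name: body text} for every 'value NAME' block."""
    return {
        m.group(1).split()[1]: m.group(2)
        for m in VALUE_BLOCK.finditer(text)
    }


# CRLF input with a space and a tab after the final semicolon.
crlf_text = (
    "   value Y1N5NALC\r\n"
    "      5 = 'No;Maybe'  \r\n"
    "      7 = 'Not ascertained' ; \t\r\n"
)

result = parse_value_blocks(crlf_text)
print(result)
```

The same function should also work on LF-only input, since both \r parts of the pattern are optional.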