Multiline mode in Regex not performing according to documentation

Time:02-10

According to the .NET documentation:

The RegexOptions.Multiline option, or the m inline option, enables the regular expression engine to handle an input string that consists of multiple lines. It changes the interpretation of the ^ and $ language elements so that they match the beginning and end of a line, instead of the beginning and end of the input string.

By default, $ matches only the end of the input string. If you specify the RegexOptions.Multiline option, it matches either the newline character (\n) or the end of the input string.

(emphasis mine)

Source: https://docs.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-options#multiline-mode

This seems to say that $ matches the newline character itself. However, that does not appear to be the case. The code:

using System;
using System.Text.RegularExpressions;
var m = Regex.Match("123\n456", @"123$", RegexOptions.Multiline);
Console.WriteLine(m.Length);

prints 3, not 4 as would be expected if $ matched newline.

Is this a bug? A documentation error?

Answer:

This is a documentation error. All anchors (^, $, \A, \Z, \z, \G) are zero-width assertions: they do not consume any text. Such non-consuming patterns match positions in a string, not the characters themselves.

If you specify the RegexOptions.Multiline option, the $ anchor matches either the position immediately before a newline character (\n, line feed (LF) char) or the end of the input string.
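
A minimal sketch of the difference, reusing the input from the question. The variable names (anchorOnly, withNewline, noMultiline) are illustrative only. It assumes a project with top-level statements enabled (C# 9 or later):

```csharp
using System;
using System.Text.RegularExpressions;

string input = "123\n456";

// In Multiline mode, $ asserts the position just before '\n'; it consumes nothing.
Match anchorOnly = Regex.Match(input, @"123$", RegexOptions.Multiline);

// To include the newline in the match, it has to be matched explicitly.
Match withNewline = Regex.Match(input, @"123$\n", RegexOptions.Multiline);

// Without Multiline, $ matches only at the end of the input (or before a
// final '\n'), so "123$" finds nothing in this string.
Match noMultiline = Regex.Match(input, @"123$");

Console.WriteLine(anchorOnly.Length);   // 3
Console.WriteLine(withNewline.Length);  // 4
Console.WriteLine(noMultiline.Success); // False
```

The match length grows to 4 only when \n appears in the pattern, which is consistent with $ being a position assertion rather than a character match.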
