Ignoring the leading space captured in a repeated group?

The following pattern matches a line that starts with 'v' followed by an arbitrary number of floats:

    const RegexOptions options = RegexOptions.Compiled | RegexOptions.Singleline | RegexOptions.CultureInvariant;

    var regex = new Regex(@"^\s*v((?:\s+)[-+]?\b\d*\.?\d+\b)+$", options);

    const string text = @"
v +0.5 +0.5 +0.5 0.0 1.0 1.0
v +0.5 -0.5 -0.5 1.0 0.0 1.0
v -0.5 +0.5 -0.5 1.0 1.0 0.0
v -0.5 -0.5 +0.5 0.0 0.0 0.0
";

    using var reader = new StringReader(text);

    for (var s = reader.ReadLine(); s != null; s = reader.ReadLine())
    {
        if (string.IsNullOrWhiteSpace(s))
            continue;

        var match = regex.Match(s);

        if (match.Success)
        {
            foreach (Capture capture in match.Groups[1].Captures)
            {
                Console.WriteLine($"'{capture.Value}'");
            }
        }
    }

It works as expected except that it includes the leading space before a number:

' +0.5'
' +0.5'
' +0.5'
' 0.0'
' 1.0'
' 1.0'
...

Question:

How can I ignore the leading space for each captured number?

CodePudding user response：

You could change the regex so that the whitespace is matched outside the capture group instead of being captured.

The part (?:\s+) is the same as just \s+, and because every repetition of the group starts with one or more whitespace characters, you can omit the word boundary \b at the end.

Note that in C#, \d can match more than [0-9]: by default it matches any Unicode decimal digit.

    ^\s*v(?:\s+([-+]?\b\d*\.?\d+))+$

The line in C# would be:

    var regex = new Regex(@"^\s*v(?:\s+([-+]?\b\d*\.?\d+))+$", options);

Output

'+0.5'
'+0.5'
'+0.5'
'0.0'
'1.0'
'1.0'
'+0.5'
'-0.5'
'-0.5'
'1.0'
'0.0'
'1.0'
'-0.5'
'+0.5'
'-0.5'
'1.0'
'1.0'
'0.0'
'-0.5'
'-0.5'
'+0.5'
'0.0'
'0.0'
'0.0'
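
On the remark about \d: the small sketch below shows the difference, assuming you want to check it yourself. The class and method names (DigitClassDemo, IsAllDigits) are made up for this illustration. It tests a string of Arabic-Indic digits against \d with and without RegexOptions.ECMAScript, and against an explicit [0-9] class.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitClassDemo
{
    // Hypothetical helper: true when the whole input is matched by \d+ under the given options.
    public static bool IsAllDigits(string input, RegexOptions options) =>
        Regex.IsMatch(input, @"^\d+$", options);

    public static void Main()
    {
        // Arabic-Indic digits one, two, three (Unicode category Nd, but not ASCII).
        const string arabicIndic = "\u0661\u0662\u0663";

        // Default behaviour: \d matches any Unicode decimal digit.
        Console.WriteLine(IsAllDigits(arabicIndic, RegexOptions.None));       // True

        // ECMAScript-compliant behaviour: \d is equivalent to [0-9].
        Console.WriteLine(IsAllDigits(arabicIndic, RegexOptions.ECMAScript)); // False

        // An explicit character class works with any combination of options.
        Console.WriteLine(Regex.IsMatch(arabicIndic, @"^[0-9]+$"));           // False
    }
}
```

RegexOptions.ECMAScript can only be combined with IgnoreCase, Multiline and Compiled, so it would not work together with the Singleline and CultureInvariant flags used in the question. If ASCII-only digits matter for your input, replacing \d with [0-9] in the pattern is probably the simpler route.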

CodePudding user response：

You might be overcomplicating this. I suggest just using the following regex pattern:

    [+-]?\d+(?:\.\d+)?

Your updated C# code:

    var regex = new Regex(@"[+-]?\d+(?:\.\d+)?", options);
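
Because this pattern matches a single number rather than a whole line, it presumably has to be used with Regex.Matches instead of Regex.Match plus Captures. Here is a minimal sketch under that assumption. FloatScanDemo and ParseNumbers are hypothetical names, and the conversion to double is an addition that is not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class FloatScanDemo
{
    private static readonly Regex Number =
        new Regex(@"[+-]?\d+(?:\.\d+)?", RegexOptions.Compiled | RegexOptions.CultureInvariant);

    // Hypothetical helper: returns every number found in the line, in order of appearance.
    public static List<double> ParseNumbers(string line)
    {
        var values = new List<double>();

        // Each Match is one number; no whitespace is included in m.Value.
        foreach (Match m in Number.Matches(line))
        {
            values.Add(double.Parse(m.Value, CultureInfo.InvariantCulture));
        }

        return values;
    }

    public static void Main()
    {
        var values = ParseNumbers("v +0.5 -0.5 -0.5 1.0 0.0 1.0");

        // Format with the invariant culture so the decimal separator is always '.'.
        var formatted = values.ConvertAll(v => v.ToString(CultureInfo.InvariantCulture));
        Console.WriteLine(string.Join(", ", formatted)); // 0.5, -0.5, -0.5, 1, 0, 1
    }
}
```

Keep in mind that this approach is looser than the anchored pattern: it does not check that the line starts with 'v', and it picks up numbers anywhere in the text, so validate the line separately if that matters.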