Regex to match nth string separated by whitespace C#

Time:10-20

I need to configure a regex to match the nth item (whitespace-separated). What I have so far gets the 3rd item from a line, but only as a capture group. Is it possible to modify the regex so that the 3rd item is the first match in the result, rather than a group?

https://regex101.com/r/FKscLq/1

Also, is there an equivalent regex to match the nth number (whitespace-separated)?

E.g. in the string below, 2323 should match as the 2nd number, and there should be no match for the 3rd number.

Fiji 123545 27.10.1981 Westpac 2323 Bank 232dcc desc

Edit: I now have a regex that matches the nth word. See below, it works beautifully. https://regex101.com/r/2F4J9o/1

I still need to get the nth number match though.

CodePudding user response:

To get a match only for the second number, you can use a positive lookbehind assertion with a quantifier {n} that skips n numbers before the match. A number here means digits surrounded by whitespace (or the start or end of the string), matched with (?<!\S)\d+(?!\S), so that a token such as 27.10.1981 is not counted.

In this pattern, {n} is {1}: one number is skipped, so the match is the second number.

(?<=^(?:(?:(?!(?<!\S)\d+(?!\S)).)*(?<!\S)\d+(?!\S)){1}(?:(?!(?<!\S)\d+(?!\S)).)*)(?<!\S)\d+(?!\S)
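
As a rough illustration of how the lookbehind pattern could be parameterised in C#, the sketch below builds it for an arbitrary n. The class and method names (NthNumberLookbehind, Find) are hypothetical and not part of the original answer. It relies on .NET allowing quantifiers inside a lookbehind.

```csharp
using System;
using System.Text.RegularExpressions;

static class NthNumberLookbehind
{
    // A standalone number: digits bounded by whitespace or the string edges.
    private const string Number = @"(?<!\S)\d+(?!\S)";

    // Zero or more characters, none of which starts a standalone number.
    private const string Filler = "(?:(?!" + Number + ").)*";

    // Hypothetical helper: returns the nth (1-based) standalone number, or null.
    public static string Find(string input, int n)
    {
        if (n < 1) return null;

        // Skip n - 1 numbers inside the lookbehind, then match the next one.
        string pattern =
            "(?<=^(?:" + Filler + Number + "){" + (n - 1) + "}" + Filler + ")" + Number;

        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }
}
```

With the sample line from the question, Find(input, 2) should give 2323 and Find(input, 3) should give null, because 27.10.1981 and 232dcc are not standalone numbers.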


Note that it is much easier to use a capture group:

^(?:(?:(?!(?<!\S)\d+(?!\S)).)*(\d+)){2}
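
A minimal C# sketch of the capture-group approach follows. It makes one change that is my assumption rather than part of the answer: the group itself is wrapped in (?<!\S) and (?!\S). Without those boundaries, backtracking appears able to capture part of a token (for example the leading digits of 232dcc) when the line holds fewer than n numbers. The names NthNumberCapture and Find are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class NthNumberCapture
{
    // Hypothetical helper: returns the nth (1-based) standalone number, or null.
    public static string Find(string input, int n)
    {
        if (n < 1) return null;

        // Repeat "filler + standalone number" n times from the start of the line.
        // Assumption: boundaries are added around the group so it can only
        // capture a complete whitespace-delimited number.
        string pattern =
            @"^(?:(?:(?!(?<!\S)\d+(?!\S)).)*(?<!\S)(\d+)(?!\S)){" + n + "}";

        Match m = Regex.Match(input, pattern);

        // For a repeated group, Groups[1].Value is the last capture, i.e. the nth number.
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

The match itself spans everything from the start of the line, so the number has to be read from group 1 rather than from the match value.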



To match the 3rd whitespace-separated item directly, you don't need any capture groups:

(?<=^(?:\S+\s){2})\S+
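
The same idea can be sketched in C# for any position. NthWord and Find are hypothetical names. The pattern keeps the single \s from the answer, so it assumes exactly one whitespace character between items; for input with runs of whitespace, \s+ would probably be needed instead.

```csharp
using System;
using System.Text.RegularExpressions;

static class NthWord
{
    // Hypothetical helper: returns the nth (1-based) whitespace-separated item, or null.
    public static string Find(string input, int n)
    {
        if (n < 1) return null;

        // The lookbehind requires exactly n - 1 "item + one whitespace" pairs
        // between the start of the line and the match.
        string pattern = @"(?<=^(?:\S+\s){" + (n - 1) + @"})\S+";

        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }
}
```

Here the item is the match value itself, which is what the question asked for.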


CodePudding user response:

A simple (and likely faster) solution that doesn't completely rely on Regex:

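using System;
using System.Text.RegularExpressions;

static string GetNthNumber(string input, int whichOne)
{
    // Split on spaces and tabs, dropping empty entries caused by repeated separators.
    var words = input.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
    var index = 0;
    // A word counts as a number only if it consists entirely of digits.
    const string pattern = @"^[0-9]+$";
    var regex = new Regex(pattern);
    foreach (var word in words)
    {
        if (regex.IsMatch(word))
        {
            index++;
            if (index == whichOne)
            {
                return word;
            }
        }
        // Stop early once the count has passed the requested position (e.g. whichOne < 1).
        if (index > whichOne)
        {
            return null;
        }
    }
    // Fewer than whichOne numbers were found.
    return null;
}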

Some test code:

const string input = "Fiji 123545 27.10.1981 Westpac 2323 Bank 232dcc desc";
foreach (var index in Enumerable.Range(1, 4))
{
    var result = GetNthNumber(input, index);
    var show = result ?? "(null)";
    Console.WriteLine($"{index}: {show}");
}

Results:

1: 123545
2: 2323
3: (null)
4: (null)

If there are other characters that you consider whitespace, just add them to the array argument passed to string.Split.
