Regex .NET OR operator not working on alternative group

Time:07-03

I need a regex that validates a string of digits matching either the "aabb" or the "abba" pattern.

For example, both 1122 and 1221 are valid.

The regexes for "aabb" and "abba" each work fine on their own. But when I combine them as "aabb" OR "abba", the "aabb" case always fails (1122 is reported as not valid).

Here is my implementation in C#:

string phoneNumber = "1221"; // "1122" failed
Dictionary<string, string> subPatterns = new Dictionary<string, string>();
subPatterns[@"(\d)(\d)\2\1$"] = "abba";
subPatterns[@"(\d)\1(\d)\2$"] = "aabb";
string pattern = string.Join("|", subPatterns.Select(e => e.Key));
                
foreach (Match m in Regex.Matches(phoneNumber, pattern))
{               
    if (m.Success)
    {
        Console.WriteLine("TRUE");
    }               
}

Did I miss something?

CodePudding user response:

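Capture groups are numbered from left to right across the whole pattern, so after joining with | the groups in the second alternative are 3 and 4, not 1 and 2. The \1 and \2 in the "aabb" part then refer to the groups of the "abba" part, which are unset when that alternative did not match, so the backreferences fail. You can either account for the incremented numbers in the second alternative: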

subPatterns[@"(\d)(\d)\2\1$"] = "abba";
subPatterns[@"(\d)\3(\d)\4$"] = "aabb";

The joined pattern will then look like this, matching 4 digits at the end of the string because of the $:

(\d)(\d)\2\1$|(\d)\3(\d)\4$
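
To make the difference concrete, here is a minimal sketch that runs the original joined pattern and the renumbered one side by side. It assumes a C# 9+ project with top-level statements; the variable names broken and renumbered and the test input 1234 are only illustrative and not part of the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Original joined pattern: \1 and \2 in the second alternative point at the
// groups of the first alternative, which are unset when that alternative fails.
string broken = @"(\d)(\d)\2\1$|(\d)\1(\d)\2$";

// Renumbered pattern: the second alternative refers to its own groups 3 and 4.
string renumbered = @"(\d)(\d)\2\1$|(\d)\3(\d)\4$";

foreach (string input in new[] { "1221", "1122", "1234" })
{
    bool withBroken = Regex.IsMatch(input, broken);
    bool withRenumbered = Regex.IsMatch(input, renumbered);
    Console.WriteLine($"{input}: broken={withBroken}, renumbered={withRenumbered}");
}
// Prints:
// 1221: broken=True, renumbered=True
// 1122: broken=False, renumbered=True
// 1234: broken=False, renumbered=False
```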

Or you can reuse the same group names in both alternatives and refer to them with named backreferences:

subPatterns[@"(?<1>\d)(?<2>\d)\k<2>\k<1>"] = "abba";
subPatterns[@"(?<1>\d)\k<1>(?<2>\d)\k<2>"] = "aabb";

The pattern will then look like

(?<1>\d)(?<2>\d)\k<2>\k<1>|(?<1>\d)\k<1>(?<2>\d)\k<2>

Note that if the match should cover the whole string, you can wrap the alternation in a non-capturing group and add the anchors ^ and $, so the whole pattern will look like

^(?:(?<1>\d)(?<2>\d)\k<2>\k<1>|(?<1>\d)\k<1>(?<2>\d)\k<2>)$
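
As a rough sketch of how the anchored pattern could be used for validation, assuming a C# 9+ project with top-level statements. The helper name IsAabbOrAbba and the sample inputs are hypothetical and only serve the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Both alternatives reuse the group names 1 and 2, which .NET allows,
// so each alternative can refer to its own captures with \k<1> and \k<2>.
Regex wholeString = new Regex(
    @"^(?:(?<1>\d)(?<2>\d)\k<2>\k<1>|(?<1>\d)\k<1>(?<2>\d)\k<2>)$");

// Hypothetical helper: true only when the entire input is "abba" or "aabb".
bool IsAabbOrAbba(string s) => wholeString.IsMatch(s);

Console.WriteLine(IsAabbOrAbba("1221"));   // True  (abba)
Console.WriteLine(IsAabbOrAbba("1122"));   // True  (aabb)
Console.WriteLine(IsAabbOrAbba("1234"));   // False (neither shape)
Console.WriteLine(IsAabbOrAbba("91221"));  // False (extra leading digit, rejected by ^)
```

Without the leading ^, an input such as 91221 would still match, because the pattern would only need to find the four digits at the end of the string.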

