Regex capture multi matches in Group

Time:02-22

I'm not sure if this is possible. I'm searching for a way to capture multiple matches in a group.

This works perfectly fine:

"Catch me if you can" -match "(?=.*(Catch))"
Result: Catch

I would like to have the result of two matches in the group:

"Catch me if you can" -match "(?=.*Catch)(?=.*me)"
Expected Result: Catch me

CodePudding user response:

You're trying to:

  • match two separate regexes,
  • and report what specific strings they matched only if BOTH regexes matched.

As Wiktor Stribiżew notes in a comment, one option is a switch statement with the -Regex switch. It in effect performs multiple matching operations per input string, unless a branch's script block short-circuits the evaluation of the remaining branches with break or continue:

$matched = switch -Regex ('Catch me if you can') {
  'Catch' { $Matches[0] } # Fall through to next conditional
  'me'    { $Matches[0] }
}
if ($matched.Count -eq 2) { # *both* regexes matched
  $matched -join ' '  # -> 'Catch me'
}

Note: The above reports the regex matches in the order in which the regexes matched based on the order of the switch conditionals, not based on the order in which the matching strings appeared in the input string.


Alternatively, use the [regex]::Matches() .NET API with an alternation (|) construct:

$matched = [regex]::Matches('Catch me if you can', 'Catch|me', 'IgnoreCase').Value
if ($matched.Count -eq 2) { $matched -join ' '}    # -> 'Catch me'

Note the use of the IgnoreCase option, so as to match PowerShell's default behavior of case-insensitive matching.
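One caveat with the alternation approach: the count test can misfire when one alternative matches more than once, for example when 'me' also occurs inside another word. The following is a minimal sketch of a variant that tests each regex separately and then orders the results by their position in the input. The names $text, $patterns, $hits and $result are illustrative only, and only the first match of each regex is considered.

```powershell
$text     = 'Catch me if you came'   # sample input: 'me' also occurs inside 'came'
$patterns = 'Catch', 'me'            # hypothetical list of regexes that must all match

# Collect the first match of each regex, if there is one.
$hits = foreach ($p in $patterns) {
    $m = [regex]::Match($text, $p, 'IgnoreCase')
    if ($m.Success) { $m }
}

# Report only if every regex matched, ordered by position in the input string.
$result = if (@($hits).Count -eq $patterns.Count) {
    ($hits | Sort-Object Index).Value -join ' '
}
$result   # -> 'Catch me'
```

With this input, [regex]::Matches() with 'Catch|me' returns three matches ('Catch', 'me', and the 'me' inside 'came'), so a test for exactly two matches would report nothing.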


See also:

  • GitHub issue #7867, which suggests introducing a -matchall operator that returns all matches found in the input string, given that the -match operator only ever looks for one match.

CodePudding user response:

The (?= construct is a lookahead, but in your pattern it isn't looking ahead of anything. In the following example, the lookahead comes after "Catch" and checks whether ".*me" can be found from that point on:

Catch(?=.*me)
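
Because a lookahead consumes no characters, the overall match of that pattern is still only "Catch". As a rough sketch (the variable names are illustrative), the snippet below shows this, and also that capture groups placed inside the lookaheads keep what they matched, which is one way to get at both strings with a single -match:

```powershell
$text = 'Catch me if you can'

# The lookahead only asserts that 'me' follows; it is not part of the match.
$null = $text -match 'Catch(?=.*me)'
$overall = $Matches[0]                              # 'Catch'

# Capture groups inside lookaheads still record the text they matched.
if ($text -match '(?=.*(Catch))(?=.*(me))') {
    $both = ($Matches[1], $Matches[2]) -join ' '    # 'Catch me'
}
$both
```

Note that the groups are reported in the order they appear in the pattern, not in the input, and that with a greedy .* each group captures the last occurrence of its word.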

Also, do you really want to match "catchABCme"? I would think you would want to match "catch ABC me", but not "catchABCme", "catchABC me", or "catch ABCme".

Here is some test code to play with:

$Lines = @(
    'catch ABC me if you can',
    'catch ABCme if you can',
    'catchABC me if you can'
)
# Note: -match is case-insensitive by default, so 'Catch' also matches 'catch'.
$RegExCheckers = @(
    'Catch(?=.*me)',
    'Catch(?=.*\sme)',
    'Catch\s(?=(.*\s)?me)'
)

foreach ($RegEx in $RegExCheckers) {
    # Quote and right-align the regex so the output lines up in columns.
    $RegExOut = "`"$RegEx`"".PadLeft(22,' ')
    foreach ($Line in $Lines) {
        $LineOut = "`"$Line`"".PadLeft(26,' ')
        if ($Line -match $RegEx) {
            Write-Host "$RegExOut        matches $LineOut"
        } else {
            Write-Host "$RegExOut didn't match   $LineOut"
        }
    }
    Write-Host   # blank line between regexes
}

And here is the output:

       "Catch(?=.*me)"        matches  "catch ABC me if you can"
       "Catch(?=.*me)"        matches   "catch ABCme if you can"
       "Catch(?=.*me)"        matches   "catchABC me if you can"

     "Catch(?=.*\sme)"        matches  "catch ABC me if you can"
     "Catch(?=.*\sme)" didn't match     "catch ABCme if you can"
     "Catch(?=.*\sme)"        matches   "catchABC me if you can"

"Catch\s(?=(.*\s)?me)"        matches  "catch ABC me if you can"
"Catch\s(?=(.*\s)?me)" didn't match     "catch ABCme if you can"
"Catch\s(?=(.*\s)?me)" didn't match     "catchABC me if you can"

As you can see, the last regex requires whitespace both after "catch" and before "me".

Also, a great place to test regexes is regex101.com: you can place the regex at the top and the multiple lines you want to test it against in the box in the middle.
