RegEx named group with anchors and wildcard for characters captured

Time:10-02

Given a string of

$string = '* X64 Dynamo Core 1.3:  [DYN GUID !Uninstall (KEY MSIE PARAM)]|HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\{F1AA809A-3D47-4FB9-8854-93E070C66A20}|'

I need to capture two named groups. Everything inside the | | at the end is a reference path, another key that relates to the one that produced this message. And everything before the | | is the message for this key. The | | is extraneous, and is only needed to demarcate the reference path. Ideally I would like to end up with

$result.Group.message > * X64 Dynamo Core 1.3:  [DYN GUID !Uninstall (KEY MSIE PARAM)]

and

$result.Group.referencePath > HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\{F1AA809A-3D47-4FB9-8854-93E070C66A20}

I tried

[regex]$pattern = '(?<message>^*)(?<referencePath>|*|$)'
$pattern.Match($string)

and

[regex]$pattern = '^(?<message>*)(?<referencePath>{*})$'

and

[regex]$pattern = '^(?<message>*)|(?<referencePath>*)|$'

But all throw the error

Cannot convert value "^(?<message>*)(?<referencePath>{*})$" to type "System.Text.RegularExpressions.Regex". Error: "parsing "^(?<message>*)(?<referencePath>{*})$" - Quantifier {x,y} following nothing."

I think my issue is with the fact that I am capturing any character other than |, and not handling that correctly.

For context, strings like this are stored in an [Ordered] dictionary with the paths as the keys. So I need to extract the message and the reference path, find the index of the reference path, delete the current key, and then add a new key at the index of the reference one, using just the message. That way, when I get a report of hundreds of Uninstall keys being processed, the ones that got skipped (the data I am working with here) are listed directly below the related one that wasn't skipped, rather than potentially many lines away.

CodePudding user response:

The errors you're getting when instantiating a new Regex instance are all due to the unescaped quantifier *:

The asterisk (*) matches the previous element zero or more times.

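In each of your patterns at least one asterisk has no preceding element to repeat, so the parser rejects the whole pattern. To match a literal asterisk, escape it (\*); to match "any characters", quantify the dot instead (.* or .+). The pipe (|) is the alternation operator, so it also has to be escaped (\|) to match a literal pipe.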
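A minimal sketch of the difference between the bare quantifier, the escaped literal, and the dot wildcard. The shortened $sample string and the variable names are made up for illustration; they are not from the question.

```powershell
# '*' is a quantifier, so it needs something in front of it to repeat.
$invalid = '^(?<message>*)'
try {
    [regex]::new($invalid) | Out-Null
    $invalidThrew = $false
}
catch {
    $invalidThrew = $true   # .NET rejects the pattern when the Regex is constructed
}

$sample = '* text|path|'    # hypothetical, shortened input

# Escaped: matches a literal asterisk at the start of the string.
$literalStar = [regex]::Match($sample, '^\*').Value

# The dot gives the quantifier something to repeat; stop before the first pipe.
$anyChars = [regex]::Match($sample, '^.*?(?=\|)').Value

# Escaped: matches a literal pipe instead of acting as alternation.
$literalPipe = [regex]::Match($sample, '\|').Value
```
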

Here is a pattern that works for the sample string, although it could be refined:

$re    = [regex] '(?<message>^\*.+?)\|(?<referencePath>.+)(?=\|)'
$match = $re.Match($string)
$match.Groups['message'].Value       # => * X64 Dynamo Core 1.3:  [DYN GUID !Uni....
$match.Groups['referencePath'].Value # => HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Win...
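Since the question mentions capturing "any character other than |", a negated character class is one possible refinement. This is only a sketch, and it assumes that neither the message nor the reference path ever contains a pipe and that the string always ends with the closing pipe.

```powershell
$string = '* X64 Dynamo Core 1.3:  [DYN GUID !Uninstall (KEY MSIE PARAM)]|HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\{F1AA809A-3D47-4FB9-8854-93E070C66A20}|'

# [^|]+ means "one or more characters that are not a pipe".
# Inside a character class the pipe does not need escaping.
$re    = [regex] '^(?<message>[^|]+)\|(?<referencePath>[^|]+)\|$'
$match = $re.Match($string)

if ($match.Success) {
    $message       = $match.Groups['message'].Value
    $referencePath = $match.Groups['referencePath'].Value
}
```

Anchoring both ends with ^ and $ makes the pattern fail outright on strings that do not have the message|path| shape, rather than matching part of them.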

The -match operator also works, without the need to instantiate a Regex instance. The captured groups are stored in the automatic variable $Matches:

$re = '(?<message>^\*.+?)\|(?<referencePath>.+)(?=\|)'
if($string -match $re) {
    $Matches['message']       # => * X64 Dynamo Core 1.3:  [DYN GUID !Uni....
    $Matches['referencePath'] # => HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Win...
}
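For the reordering described in the question, a rough sketch follows. It rests on several assumptions that the question does not confirm: that the dictionary values hold the message|path| strings, that the moved entry keeps its own path as the key, and that "directly below" means one position after the reference entry. The $report contents and the HKLM:\Example\... paths are made up for illustration.

```powershell
# Hypothetical data: keys are registry paths, values are report lines.
$report = [ordered]@{
    'HKLM:\Example\Skipped'   = '* skipped message|HKLM:\Example\Processed|'
    'HKLM:\Example\Other'     = 'other message'
    'HKLM:\Example\Processed' = 'processed message'
}

$re = [regex] '^(?<message>[^|]+)\|(?<referencePath>[^|]+)\|$'

# Snapshot the keys so the dictionary can be modified inside the loop.
foreach ($key in @($report.Keys)) {
    $match = $re.Match([string]$report[$key])
    if (-not $match.Success) { continue }

    $message       = $match.Groups['message'].Value
    $referencePath = $match.Groups['referencePath'].Value

    # Skip entries whose reference path is not in the report.
    # Note: [array]::IndexOf compares strings case-sensitively.
    if ([array]::IndexOf(@($report.Keys), $referencePath) -lt 0) { continue }

    # Remove first, so the reference index reflects the final layout.
    $report.Remove($key)
    $index = [array]::IndexOf(@($report.Keys), $referencePath)

    # Re-insert directly below the reference entry, keeping only the message.
    $report.Insert($index + 1, $key, $message)
}
```

Removing the entry before looking up the reference index avoids an off-by-one error when the skipped entry originally sat above its reference entry.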