After playing around with a PowerShell script for a while, I was wondering whether there is a version of this that doesn't use C#. It feels like I am missing some information on how to pipe things properly.
$packages = Get-ChildItem "C:\Users\A\Downloads" -Filter "*.nupkg" |
    %{ $_.Name }
    # Select-String -Pattern "(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)" |
    # %{ @($_.Matches[0].Groups["packageId"].Value, $_.Matches[0].Groups["version"].Value) }
foreach ($package in $packages){
    $match = [System.Text.RegularExpressions.Regex]::Match($package, "(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)")
    Write-Host "$($match.Groups["packageId"].Value) - $($match.Groups["version"].Value)"
}
Originally I tried to do this with PowerShell only and thought that with @(1,2,3) you could create an array.
I ended up bypassing the issue by doing the regex matching with C# instead of PowerShell, which works, but I am curious how this would have been done with PowerShell only.
While there are 4 packages, the PowerShell-only version produced 8 lines. So accessing my data like $packages[0][0] to get a package ID never worked, because the 8 lines were strings, while I expected 4 arrays to be returned.
CodePudding user response:
Terminology note re without using c#: You mean without direct use of .NET APIs. By contrast, C# is just another .NET-based language that can make use of such APIs, just like PowerShell itself.
Note:
The next section answers the following question: How can I avoid direct calls to .NET APIs for my regex-matching code in favor of using PowerShell-native commands (operators, automatic variables)?
See the bottom section for the Select-String solution that was your true objective; the tl;dr is:
# Note the `, `, which ensures that the array is output *as a single object*
%{ , @($_.Matches[0].Groups["packageId"].Value, $_.Matches[0].Groups["version"].Value) }
The PowerShell-native (near-)equivalent of your code is (note that the assumption is that $package contains the content of the input file):
# Caveat: -match is case-INSENSITIVE; use -cmatch for case-sensitive matching.
if ($package -match '(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)') {
    "$($Matches['packageId']) - $($Matches['Version'])"
}
-match, the regular-expression matching operator, is the equivalent of [System.Text.RegularExpressions.Regex]::Match() (which you can shorten to [regex]::Match()) in that it only looks for (at most) one match.
Caveat re case-sensitivity: -match (and its rarely used alias -imatch) is case-insensitive by default, as all PowerShell operators are; for case-sensitive matching, use the c-prefixed variant, -cmatch.
By contrast, .NET APIs are case-sensitive by default; you'd have to pass the [System.Text.RegularExpressions.RegexOptions]::IgnoreCase flag to [regex]::Match() for case-insensitive matching (you may use 'IgnoreCase', which PowerShell auto-converts for you).
As of PowerShell 7.2.x, there is no operator that is the equivalent of the related return-ALL-matches .NET API, [regex]::Matches(). See GitHub issue #7867 for a green-lit but yet-to-be-implemented proposal to introduce one, named -matchall.
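To make the case-sensitivity difference concrete, here is a minimal sketch. The file name is a made-up sample (it does not come from the question), and the variable names are purely illustrative.

```powershell
# Hypothetical sample input; note the upper-case extension.
$name = 'Example.Package.1.0.0.NUPKG'

# PowerShell operators: case-insensitive unless the c-prefixed variant is used.
$insensitive = $name -match  '\.nupkg$'
$sensitive   = $name -cmatch '\.nupkg$'

# .NET API: case-sensitive unless the IgnoreCase option is passed.
$dotnetDefault    = [regex]::Match($name, '\.nupkg$').Success
$dotnetIgnoreCase = [regex]::Match($name, '\.nupkg$', 'IgnoreCase').Success

# Returning ALL matches currently requires the .NET API.
$allNumbers = [regex]::Matches($name, '\d+').Value
```

With this input, only the -match and 'IgnoreCase' variants should report a match, and $allNumbers should hold the three numeric parts of the version.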
However, instead of directly returning an object describing what was (or wasn't) matched, -match returns a Boolean, i.e. $true or $false, to indicate whether matching succeeded.
Only if -match returns $true does information about a match become available, namely via the automatic $Matches variable, which is a hashtable reflecting the matching parts of the input string: entry 0 is always the full match, with optional additional entries reflecting what any capture groups ((...)) captured, either by index, if they're anonymous (starting with 1) or, as in your case, for named capture groups ((?<name>...)) by name.
Syntax note: Given that PowerShell allows use of dot notation (property-access syntax) even with hashtables, the above command could have used $Matches.packageId instead of $Matches['packageId'], for instance, which also works with the numeric (index-based) entries, e.g., $Matches.0 instead of $Matches[0].
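As an illustration of how $Matches is populated, the following sketch applies the regex from the question to a single, made-up file name (the name and the variable names are assumptions chosen for the example).

```powershell
# Hypothetical sample input, standing in for one of the .nupkg file names.
$fileName = 'Example.Package.1.2.3.nupkg'
$pattern  = '(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)'

if ($fileName -match $pattern) {
    $whole       = $Matches[0]            # entry 0: the full match
    $idByIndexer = $Matches['packageId']  # named group via index syntax
    $idByDot     = $Matches.packageId     # same entry via dot notation
    $version     = $Matches.version
}
```

Because the trailing (?=.nupkg) is a look-ahead, the extension is not part of the full match in entry 0.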
Caveat: If an array (enumerable) is used as the LHS operand, -match's behavior changes:
- $Matches is not populated.
- Filtering is performed; that is, instead of returning a Boolean indicating whether matching succeeded, the subarray of matching input strings is returned.
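A short sketch of this caveat, using invented file names: with an array on the left, -match acts as a filter, so capture groups have to be extracted by matching each string individually. The [pscustomobject] output shape is just one possible choice.

```powershell
# Hypothetical sample input.
$names = 'A.Pkg.1.0.0.nupkg', 'readme.txt', 'B.Pkg.2.1.0.nupkg'

# Array LHS: returns the matching strings, not a Boolean.
$filtered = $names -match '\.nupkg$'

# Scalar LHS inside a loop: $Matches is populated for each successful match.
$pattern = '(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)'
$pairs = foreach ($n in $names) {
    if ($n -match $pattern) {
        [pscustomobject]@{ PackageId = $Matches.packageId; Version = $Matches.version }
    }
}
```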
Note that the $Matches hashtable only provides the matched strings, not also metadata such as index and length, as found in [regex]::Match()'s return object, which is of type [System.Text.RegularExpressions.Match].
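If that metadata is needed, calling the .NET API directly is one option. The sketch below assumes a made-up file name and shows that the group's Index and Length describe where the captured text sits in the input.

```powershell
# Hypothetical sample input.
$fileName = 'Example.Package.1.2.3.nupkg'
$m = [regex]::Match($fileName, '(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)')

# Each group exposes the captured text plus its position in the input string.
$versionGroup = $m.Groups['version']
$recovered    = $fileName.Substring($versionGroup.Index, $versionGroup.Length)
```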
Select-String solution:
$packages |
    Select-String '(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)' |
    ForEach-Object {
        "$($_.Matches[0].Groups['packageId'].Value) - $($_.Matches[0].Groups['version'].Value)"
    }
Select-String outputs Microsoft.PowerShell.Commands.MatchInfo instances, whose .Matches collection contains one or more [System.Text.RegularExpressions.Match] instances, i.e. instances of the same type as returned by [regex]::Match().
- Unless -AllMatches is also passed, .Matches only ever has one entry, hence the use of [0] to target that entry above.
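The effect of -AllMatches can be seen in a small sketch; the input string and the simple \d+ pattern are assumptions picked only to produce several matches on one line.

```powershell
# Hypothetical sample input.
$line = 'Example.Package.1.2.3.nupkg'

# Default: only the first match per input line is reported.
$first = ($line | Select-String '\d+').Matches

# With -AllMatches: every match on the line is reported.
$all = ($line | Select-String '\d+' -AllMatches).Matches
```

Here $first should contain a single entry, whereas $all should contain one entry per number in the version.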
As you can see, working with Select-String's output objects requires you to ultimately work with the same .NET type as when you call [regex]::Match() directly.
However, no method calls are required, and discovering the properties of the output objects is made easy in PowerShell via the Get-Member cmdlet.
If you want to capture the matches in a jagged array:
$capturedStrings = @(
    $packages |
        Select-String '(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)' |
        ForEach-Object {
            # Output an array of all capture-group matches,
            # *as a single object* (note the `, `)
            , $_.Matches[0].Groups.Where({ $_.Name -ne '0' }).Value
        }
)
This returns an array of arrays, each element of which is the array of capture-group matches for a given package, so that $capturedStrings[0][0] returns the packageId value for the first package, for instance.
Note:
- $_.Matches[0].Groups.Where({ $_.Name -ne '0' }).Value programmatically enumerates all capture-group matches and returns their .Value property values as an array, using member-access enumeration; note how name '0' must be excluded, as it represents the whole match.
With the capture groups in your specific regex, the above is equivalent to the following, as shown in a commented-out line in your question:
@($_.Matches[0].Groups['packageId'].Value, $_.Matches[0].Groups['version'].Value)
- , ..., the unary form of the array-construction operator, is used as a shortcut for outputting the array (symbolized by ... here) as a whole, as a single object. By default, enumeration would occur and the elements would be emitted one by one. , ... is in effect a shortcut to the conceptually clearer Write-Output -NoEnumerate ... - see this answer for an explanation of the technique.
- Additionally, @(...), the array subexpression operator, is needed in order to ensure that a jagged array (nested array) is returned even in the event that only one array is returned across all $packages.
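The following sketch isolates the two operators, using placeholder strings ('id', 'ver') instead of real capture-group values. It also shows why the attempt in the question yielded 8 strings for 4 packages: without the unary comma, each two-element array is enumerated into the pipeline.

```powershell
# Without the unary comma: each 2-element array is enumerated,
# so two iterations emit four separate strings.
$flat = @( 1..2 | ForEach-Object { @('id', 'ver') } )

# With the unary comma: each array is emitted as a single object,
# so two iterations produce two nested arrays.
$nested = @( 1..2 | ForEach-Object { , @('id', 'ver') } )

# A single iteration: only the outer @(...) keeps the result jagged.
$single    = @( 1 | ForEach-Object { , @('id', 'ver') } )
$unwrapped =    1 | ForEach-Object { , @('id', 'ver') }
```

With only one input, $single[0][0] should still be 'id', while $unwrapped is itself the two-element array, so $unwrapped[0] is 'id' and a second index would address a character of that string.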