I have a file with git commits,
c2b7b1913 Merged PR 38064: Lorem ipsum dolor sit amet
baa810a57 Merged PR 37937: Lorem ipsum dolor sit amet
8d67d563c Merged PR 37825: Lorem ipsum dolor sit amet
2a061da0b Merged PR 37494: Lorem ipsum dolor sit amet
How do I use PowerShell to get just the PR number, i.e. I would like
38064
37937
37825
37494
Here is my attempt:
Get-Content .\testdata.txt `
| Select-String -Pattern "^.*Merged PR (\d{5}):.*$" -AllMatches `
| ForEach-Object {$_.Matches.Groups}
Which seems to return the correct data. Below is the output. But how do I get to the regex group?
Groups : {0, 1}
Success : True
Name : 0
Captures : {0}
Index : 0
Length : 54
Value : c2b7b1913 Merged PR 38064: Lorem ipsum dolor sit amet
Success : True
Name : 1
Captures : {1}
Index : 21
Length : 5
Value : 38064
Groups : {0, 1}
Success : True
Name : 0
Captures : {0}
Index : 0
Length : 54
Value : baa810a57 Merged PR 37937: Lorem ipsum dolor sit amet
Success : True
Name : 1
Captures : {1}
Index : 21
Length : 5
Value : 37937
Groups : {0, 1}
Success : True
Name : 0
Captures : {0}
Index : 0
Length : 54
Value : 8d67d563c Merged PR 37825: Lorem ipsum dolor sit amet
Success : True
Name : 1
Captures : {1}
Index : 21
Length : 5
Value : 37825
Groups : {0, 1}
Success : True
Name : 0
Captures : {0}
Index : 0
Length : 54
Value : 2a061da0b Merged PR 37494: Lorem ipsum dolor sit amet
Success : True
Name : 1
Captures : {1}
Index : 21
Length : 5
Value : 37494
Here is the equivalent sed command:
sed -nE 's/^.*Merged PR ([[:digit:]]{5}):.*$/\1/p'
CodePudding user response:
You can get them by accessing the .Groups[1].Value property:
Get-Content .\testdata.txt `
| Select-String -Pattern "Merged PR (\d{5}):" -AllMatches `
| ForEach-Object {$_.Matches.Groups[1].Value}
Note that the ^.* and .*$ parts are not necessary, because PowerShell regex matching does not require the pattern to match the complete string.
With the Merged PR (\d{5}): regex, you match the Merged PR substring and capture five digits into a separate group (with (\d{5})), provided they are immediately followed by a : character. Once captured, you just access the right group value in the code.
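As a self-contained illustration of the group access described above, here is a minimal sketch that runs the same pattern over an in-memory array instead of a file. The $lines and $prNumbers variable names are assumptions made for this example; the sample lines are copied from the question.

```powershell
# Sample lines copied from the question (held in memory instead of a file).
$lines = @(
    'c2b7b1913 Merged PR 38064: Lorem ipsum dolor sit amet'
    'baa810a57 Merged PR 37937: Lorem ipsum dolor sit amet'
)

# Select-String accepts strings from the pipeline. For each matching line,
# Groups[0] is the whole match and Groups[1] is the first capture group,
# which here holds the five digits.
$prNumbers = $lines |
    Select-String -Pattern 'Merged PR (\d{5}):' |
    ForEach-Object { $_.Matches[0].Groups[1].Value }

$prNumbers
```

Indexing Matches[0] explicitly makes it clear that only the first match on each line is used, which is enough when every line contains a single PR number.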
CodePudding user response:
You could simply use split:
gc C:\tmp\testdata.txt | %{($_ -split " ")[3] -replace ":"}
Or back to your version:
# No capture group is needed here: .Matches.Value is the matched text itself
$data = gc C:\tmp\testdata.txt | Select-String -Pattern "\d{5}" -AllMatches
$data.Matches.Value
CodePudding user response:
You don't need to use Get-Content; Select-String supports the -Path parameter:
Select-String -Path <path> -Pattern '(?<=PR\s)[0-9]+' | ForEach-Object { $_.Matches.Value }
(?<=PR\s)[0-9]+
(?<=PR\s) : Lookbehind; requires PR followed by one whitespace character immediately before the match, without including it in the result, for example 'PR '
[0-9]+ : 1 or more digits
Example:
<#
File contents in C:\tmp\testmsg.txt
c2b7b1913 Merged PR 38064: Lorem ipsum dolor sit amet
baa810a57 Merged PR 37937: Lorem ipsum dolor sit amet
8d67d563c Merged PR 37825: Lorem ipsum dolor sit amet
2a061da0b Merged PR 37494: Lorem ipsum dolor sit amet
#>
Select-String -Path C:\tmp\testmsg.txt -Pattern '(?<=PR\s)[0-9]+' | ForEach-Object { $_.Matches.Value }
38064
37937
37825
37494
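For comparison with the sed command in the question, the same substitute-and-print idea can be sketched with the -match and -replace operators. This is only an illustration under stated assumptions: the $lines, $pattern and $result names are made up for the example, and the second sample line is invented to show that non-matching lines are dropped.

```powershell
# One line copied from the question plus one invented non-matching line.
$lines = @(
    'c2b7b1913 Merged PR 38064: Lorem ipsum dolor sit amet'
    'not a merge commit line'
)
$pattern = '^.*Merged PR (\d{5}):.*$'

# Applied to an array, -match acts as a filter and keeps only matching lines
# (similar to sed -n with the p flag). -replace then rewrites each remaining
# line to capture group 1. '$1' is single-quoted so that PowerShell passes it
# to the regex engine instead of expanding it as a variable.
$result = @($lines -match $pattern) -replace $pattern, '$1'

$result
```

Unlike the Select-String variants, this form needs the anchored ^.* and .*$ parts, because -replace only substitutes the text the pattern actually matched.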