Each text file I am working with contains two pages of listings. Both pages use the same reference numbers (001, 002, ... etc.). The aim is to separate the two listings and store each one as its own element of an array. I have simplified the problem for testing.
# short listings (an unhelpful text file structure, but ain't that life?)
# 001 First in listing one 002 Second in listing one 003 Third in listing one 001 First in listing two 002 Second in listing two 003 Third in listing two
# read shortListings from file
$listings = Get-Content -Path C:\test\test6\shortListings.txt
[string[]]$result = $null
# regular expression where I am trying to separate listing 'ones' & listing 'twos' into string array $result
# $result[0] = "001 First in listing one 002 Second in listing one 003 Third in listing one"
# $result[1] = "001 First in listing two 002 Second in listing two 003 Third in listing two"
$regex = '(?s)(001.*?)((?=001.*?))'
$result = $listings | Select-String -Pattern $regex -AllMatches | ForEach-Object { $_.Matches.Value}
# Okay. That's listing one. But how do I get listing two?
$result[0]
CodePudding user response:
Here is one way this could be done, using a combination of -split and Group-Object:
$result = (Get-Content shortListings.txt -Raw) -split '\s*(?=\d{3})' -ne '' |
    Group-Object { [regex]::Match($_, '\w+$').Value } -AsHashTable -AsString
However, with this method each value in the hashtable is an array of strings rather than a single string. You can -join them later:
PS ..\pwsh> $result
Name Value
---- -----
one {001 First in listing one, 002 Second...
two {001 First in listing two, 002 Second...
PS ..\pwsh> $result['one']
001 First in listing one
002 Second in listing one
003 Third in listing one
PS ..\pwsh> $result['one'] -join ' '
001 First in listing one 002 Second in listing one 003 Third in listing one
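If the goal is the two-element string array from the question, the grouped values can be joined per key. The following is only a minimal sketch: it assumes the sample text is inlined in a hypothetical $text variable instead of being read from the file, and that the group keys are the literal last words 'one' and 'two' from the sample.

```powershell
# Sample text from the question, inlined so the sketch runs without the file.
# ($text is a name made up for this illustration.)
$text = '001 First in listing one 002 Second in listing one 003 Third in listing one 001 First in listing two 002 Second in listing two 003 Third in listing two'

# Split before every three-digit reference number, drop empty pieces,
# then group the entries by their last word ("one" / "two").
$groups = $text -split '\s*(?=\d{3})' -ne '' |
    Group-Object { [regex]::Match($_, '\w+$').Value } -AsHashTable -AsString

# Join each group's entries back into one string, using an explicit key order
# because a hashtable does not guarantee enumeration order.
[string[]]$result = 'one', 'two' | ForEach-Object { $groups[$_] -join ' ' }

$result[0]
$result[1]
```

Listing the keys explicitly keeps $result[0] and $result[1] in a predictable order. With real data the keys would have to come from whatever actually distinguishes the two pages.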
CodePudding user response:
# read shortListings from file
$listings = Get-Content -Path C:\test\test6\shortListings.txt
# regex produces 3 groups 0=complete listing 1=listings in one 2=listings in two
$regex = '(?s)(001.*?)(?=001.*?)(001.*?$)'
# access group matches using array $line
[string[]]$line = $null
$line = $listings | Select-String -AllMatches -Pattern $regex | ForEach-Object {$_.Matches.groups}
for ($i = 0; $i -lt $line.Count; $i++) {
    if ($i -eq 0) {
        # do nothing for complete listings
    }
    else {
        Write-Host "group: $i $($line[$i])"
    }
}
Output:
group: 1 001 First in listing one 002 Second in listing one 003 Third in listing one
group: 2 001 First in listing two 002 Second in listing two 003 Third in listing two
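The pattern above is written for exactly two listings. If there could be more, one alternative is to split the text in front of every 001 instead of capturing fixed groups. This is a rough sketch, assuming every listing starts with the standalone token 001 and that 001 never appears elsewhere in the text; the sample is inlined in a hypothetical $text variable instead of being read from the file.

```powershell
# Sample text from the question, inlined so the sketch runs without the file.
# ($text is a name made up for this illustration.)
$text = '001 First in listing one 002 Second in listing one 003 Third in listing one 001 First in listing two 002 Second in listing two 003 Third in listing two'

# Split at every position that is immediately followed by a standalone "001".
# The lookahead consumes nothing, so each piece keeps its leading "001".
# -ne '' drops any empty piece; Trim() removes the trailing space left on
# every listing except the last.
[string[]]$result = $text -split '(?=\b001\b)' -ne '' | ForEach-Object { $_.Trim() }

for ($i = 0; $i -lt $result.Count; $i++) {
    Write-Host "listing: $($i + 1) $($result[$i])"
}
```

Because the split is driven by the lookahead rather than by a fixed number of capture groups, the same line should handle one, two, or more listings without changing the pattern.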