I have a large file of streets in towns, with a format like this:
ACACIA AVENUE,
  Bridral Town
  Sandhurstpool City
ACCENT DRIVE, Sandhurstpool City
BERWICK ROAD, Bridral Town
For economy I want to find instances of the town and then retrieve the streets in that town. I want to avoid street names that are NOT in that town. My code so far is:
$listing = 'streetNameTest.txt'
$town = 'Bridral'
$regex = '(\b[A-Z]{2,}\b[$\s])'
# find the town & then the street names of that town
Select-String -Pattern $town -Path $listing -CaseSensitive |
    # for each find two scenarios: (1) town & STREET (2) town only (where STREET listed on earlier line)
    ForEach-Object {
        # check this line for UPPERCASE indicating street name
        if ($line = ($_ | Select-String $regex -CaseSensitive)) {
            Write-Host "Found street-name on same line as town-name:" $line
        }
        # else need to shift back, line by line, until UPPERCASE indicating street name found
        else {
            $lineNum = ($_ | Select-Object -ExpandProperty LineNumber)
            Write-Host Found town-name on line:$lineNum "The street-name will be on a line before town name!"
            $line = ($_ | Select-String $town -CaseSensitive)
            Write-Host The line for object is $line
        }
    }
My output is:
Found town-name on line:2 The street-name will be on a line before town name!
The line for object is C:\streetNameTest.txt:2: Bridral Town
Found street-name on same line as town-name: C:\streetNameTest.txt:5:BERWICK ROAD, Bridral Town
I am stuck on the else branch. I am attempting to deal with instances of the same street name in different towns. I need to step back from the "Bridral Town" line to the "ACACIA AVENUE," line of my source file. I have the line number but can't figure out a way to use it to step back line by line (checking each line against my UPPERCASE $regex variable).
Any suggestions appreciated.
CodePudding user response:
Per comments, you might be better off breaking your logic into two parts - the first being a simple parser that extracts the single-line or multi-line addresses from your input, and the second being a filter that inspects each address to see if it matches your criteria.
That way you don’t need to keep “reversing up” the file to re-process previous lines. It also makes the two behaviours easier to test and debug in isolation.
Here's a naive function that converts the input lines into addresses (note there's no error handling and I've had to make some assumptions based on the limited dataset in your question):
function ConvertTo-Addresses
{
    param( [string[]] $Lines )
    # assumed file format:
    #
    # * an address can be single-line or multi-line
    # * single-line addresses are in the format "<STREET NAME>[, <town name> Town][, <city name> City]"
    # * multi-line addresses are in the format "<STREET NAME>," followed by one or more lines of:
    #     * "[space] [space] <town name> Town"
    #     * "[space] [space] <city name> City"
    $i = 0;
    while( $i -lt $Lines.Length )
    {
        $address = [pscustomobject] ([ordered] @{
            "Street" = $null
            "Town"   = $null
            "City"   = $null
        });
        #write-host $Lines[$i];
        if( $Lines[$i].EndsWith(",") )
        {
            # multi-line address
            #write-host "new multi-line address"
            $address.Street = $Lines[$i].Substring(0, $Lines[$i].IndexOf(","))
            $i++;
            # consume the indented continuation lines, stopping at the end of the input
            while( ($i -lt $Lines.Length) -and $Lines[$i].StartsWith(" ") )
            {
                $line = $Lines[$i].Trim();
                if( $line.EndsWith("Town") )
                {
                    $address.Town = $line.Substring(0, $line.Length - "Town".Length).Trim();
                }
                elseif( $line.EndsWith("City") )
                {
                    $address.City = $line.Substring(0, $line.Length - "City".Length).Trim();
                }
                $i++;
            }
        }
        else
        {
            # single-line address
            #write-host "new single-line address"
            $parts = $Lines[$i].Split(",");
            $j = 0;
            $address.Street = $parts[$j];
            $j++;
            while( $j -lt $parts.Length )
            {
                $line = $parts[$j].Trim();
                if( $line.EndsWith("Town") )
                {
                    $address.Town = $line.Substring(0, $line.Length - "Town".Length).Trim();
                }
                elseif( $line.EndsWith("City") )
                {
                    $address.City = $line.Substring(0, $line.Length - "City".Length).Trim();
                }
                $j++;
            }
            $i++;
        }
        #write-host ($address | ConvertTo-Json)
        write-output $address;
    }
}
Using it for the second part is pretty straightforward:
$text = @"
ACACIA AVENUE,
  Bridral Town
  Sandhurstpool City
ACCENT DRIVE, Sandhurstpool City
BERWICK ROAD, Bridral Town
"@;
$addresses = ConvertTo-Addresses $text.Split("`n");
$addresses | ft;
# Street        Town    City
# ------        ----    ----
# ACACIA AVENUE Bridral Sandhurstpool
# ACCENT DRIVE          Sandhurstpool
# BERWICK ROAD  Bridral
$bridral = $addresses | where-object { $_.Town -eq "Bridral" }
$bridral | ft;
# Street        Town    City
# ------        ----    ----
# ACACIA AVENUE Bridral Sandhurstpool
# BERWICK ROAD  Bridral
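One thing to watch when the text has Windows (CRLF) line endings: splitting on "`n" alone leaves a carriage return on the end of each line, so a check like EndsWith(",") may not see the comma as the last character. A small sketch of the difference, using a made-up sample string ($crlfText, $naive and $clean are just illustrative names):

```powershell
# Sample text with Windows (CRLF) line endings; the content is illustrative only.
$crlfText = "ACACIA AVENUE,`r`n  Bridral Town`r`nBERWICK ROAD, Bridral Town"

# Splitting on "`n" alone keeps the "`r" at the end of every line but the last.
$naive = $crlfText.Split("`n")
$naive[0].Length    # 15: "ACACIA AVENUE," plus the stray carriage return

# The -split operator takes a regex, so '\r?\n' copes with both CRLF and LF.
$clean = $crlfText -split '\r?\n'
$clean[0].Length    # 14
```

If the lines come from Get-Content instead, they arrive already split with the line terminators removed, so this only matters for text held in a single string.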
The only downside is that performance might be slower than your original code, but tbh, unless you're processing tens of millions of addresses it's probably better to have maintainable code than fast code.
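If speed does turn out to matter, a rough alternative is a single forward pass that remembers the most recent street name, so nothing ever has to step backwards through the file. This is only a sketch under the same format assumptions as above (a non-indented line starts a new address, continuation lines are indented, and towns are written as "<name> Town"); Get-StreetsInTown and its parameters are hypothetical names, not anything built in:

```powershell
function Get-StreetsInTown
{
    param( [string[]] $Lines, [string] $Town )

    # assumption: towns appear as "<name> Town"; matched case-sensitively below
    $pattern = "\b$([regex]::Escape($Town)) Town\b"
    $street  = $null    # street of the address currently being read
    $emitted = $false   # whether that street has already been output

    foreach( $line in $Lines )
    {
        if( [string]::IsNullOrWhiteSpace($line) ) { continue }

        if( $line -notmatch '^\s' )
        {
            # non-indented line: a new address starts here;
            # the street is the text before the first comma
            $street, $places = $line -split ',', 2
            $street  = $street.Trim()
            $emitted = $false
        }
        else
        {
            # indented line: more places for the current street
            $places = $line
        }

        if( -not $emitted -and $street -and $places -cmatch $pattern )
        {
            Write-Output $street
            $emitted = $true
        }
    }
}

$sample = @(
    'ACACIA AVENUE,'
    '  Bridral Town'
    '  Sandhurstpool City'
    'ACCENT DRIVE, Sandhurstpool City'
    'BERWICK ROAD, Bridral Town'
)
$streets = @(Get-StreetsInTown -Lines $sample -Town 'Bridral')
$streets
# ACACIA AVENUE
# BERWICK ROAD
```

The trade-off is that the parsing and the filtering are tangled together again, so it's harder to test than the two-part version.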