I have a text file of 3 name entries:
# dot_test.txt
001 AALTON, Alan .....25 Every Street
006 JOHNS, Jason .... 3 Steep Street
002 BROWN. James .... 101 Browns Road
My task is to find instances of NAME. (trailing dot) where it should be NAME, (trailing comma), using the following:
Select-String -AllMatches -Path $input_path -Pattern '(?s)[A-Z]{3}.*?\D(?=\s|$)' -CaseSensitive |
ForEach-Object { if($_.Matches.Value -match '\.$'){$_.Matches.Value -replace '\,$'} }
The output is:
BROWN.
So the script block identifies the instance of NAME. but fails to make the replacement.
Any suggestions on how to achieve this would be appreciated.
CodePudding user response:
$_.Matches.Value -replace '\,$'
This attempts to replace a , (which you needn't escape as \,) at the end of ($) your match with the empty string (due to the absence of a second, replacement operand), i.e. it would effectively remove a trailing , from the match.
However, given that your match contains no , and that you instead want to replace its trailing . with a , use the following:
$_.Matches.Value -replace '\.$', ',' # -> 'BROWN,'
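To make the difference between the one-operand and two-operand forms of -replace concrete, here is a minimal sketch. The variable names ($match, $unchanged, $fixed) are made up for illustration, and the sample value is the BROWN. match from the question:

```powershell
# Sample value: the match the question's script printed.
$match = 'BROWN.'

# One operand: a trailing comma would be removed. The string has none,
# so it comes back unchanged.
$unchanged = $match -replace ',$'

# Two operands: the trailing dot (escaped as \.) is replaced with a comma.
$fixed = $match -replace '\.$', ','

$unchanged   # BROWN.
$fixed       # BROWN,
```

The dot has to be escaped because an unescaped . matches any character, whereas a comma has no special meaning in a regex.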
CodePudding user response:
You can use -replace directly, and if you need to replace both a comma and dot at the end of the string, use the [.,]$ regex:
Select-String -AllMatches -Path $input_path -Pattern '(?s)[A-Z]{3}.*?\D(?=\s|$)' -CaseSensitive | % {$_.Matches.Value -replace '\.$', ','}
Details:
(?s)[A-Z]{3}.*?\D(?=\s|$) - matches:
(?s) - turns RegexOptions.Singleline mode on, so . can also match line breaks
[A-Z]{3} - three uppercase ASCII letters
.*? - any zero or more chars, as few as possible
\D - any non-digit char
(?=\s|$) - a positive lookahead that matches a location immediately followed by a whitespace or the end of string.
The \.$ pattern matches a . at the end of string.
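As a rough check of how the pattern behaves on the sample data, the sketch below pipes the three lines from the question into Select-String as strings instead of reading them from $input_path. The variable names ($lines, $result) are assumptions made for this illustration only:

```powershell
# The three sample lines from dot_test.txt, held in memory for the demo.
$lines = @(
    '001 AALTON, Alan .....25 Every Street'
    '006 JOHNS, Jason .... 3 Steep Street'
    '002 BROWN. James .... 101 Browns Road'
)

# Same pattern as above. Each line yields one match (the surname plus its
# trailing punctuation), and -replace only alters matches ending in a dot.
$result = $lines |
    Select-String -AllMatches -Pattern '(?s)[A-Z]{3}.*?\D(?=\s|$)' -CaseSensitive |
    ForEach-Object { $_.Matches.Value -replace '\.$', ',' }

$result
# AALTON,
# JOHNS,
# BROWN,
```

Note that without the if filter from the question, every match is emitted, not only the corrected one. Names that already end in a comma pass through unchanged.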