I am formatting lists. Nearly all the issues have been fixed. But there's trouble with the following:
$line = "137 ARBICKLE, Dougal Bruce
20 Every Street Some Town, Musician 138 ARBUCKLE, Edith"
$change = $line -replace '([\n\r])(\d)', ' $2'
$change
Note that the sample text has no backtick escapes to suppress the newline: I pasted it in exactly as it appears in the original file, because I need to replace the newlines that break up the text within the output lines. I tested the regex beforehand.
Output is:
20 Every Street Some Town, Musician 138 ARBUCKLE, Edith
After reading about_Quoting_Rules I tried single quotes but no dice.
I don't understand. I expected this output:
137 ARBICKLE, Dougal Bruce 20 Every Street Some Town, Musician 138 ARBUCKLE, Edith
CodePudding user response:
Use the following regex instead:
$line -replace '\r?\n(?=\d)', ' '
\r?\n matches both CRLF and LF-only newlines and avoids your original problem (see below). Also, using a look-ahead assertion, (?=...), to match the adjacent character avoids the need for capture groups.
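As a minimal sketch of the suggested regex in action, the snippet below spells out the line break with explicit escape sequences instead of a pasted newline. That the original file uses CRLF is an assumption; the same pattern also handles LF-only input.

```powershell
# Assumption: the line break in the original file is CRLF; `r`n spells it out explicitly.
$line = "137 ARBICKLE, Dougal Bruce`r`n20 Every Street Some Town, Musician 138 ARBUCKLE, Edith"

# Replace a newline (CRLF or LF-only) with a space, but only when a digit follows.
# The look-ahead (?=\d) checks for the digit without consuming it, so no capture group is needed.
$fixed = $line -replace '\r?\n(?=\d)', ' '
$fixed
```

Because the optional \r is part of the match, both characters of a CRLF pair are removed together and nothing is left behind.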
As for what you tried:
Your $line string contains a Windows-format CRLF newline. [\n\r] matches only one character, which means that only \n (the LF) was captured by your regex, leaving the \r (CR) behind in the string.
The stray CR then resulted in broken display of the result, because when the CR is printed, the cursor position is reset to the first column on the same line, with the remainder of the string being printed there.
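One way to confirm the leftover CR is to make control characters visible before printing. This is only a diagnostic sketch: the input is shortened, it assumes CRLF line endings, and the <CR> and <LF> markers are arbitrary placeholders chosen for illustration.

```powershell
# Assumption: CRLF input, as in the question's file (shortened here).
$line = "137 ARBICKLE, Dougal Bruce`r`n20 Every Street"

# The original attempt: the character class consumes only the LF that precedes the digit.
$change = $line -replace '([\n\r])(\d)', ' $2'

# Swap control characters for visible markers so the console cannot act on them.
$visible = $change -replace "`r", '<CR>' -replace "`n", '<LF>'
$visible   # -> 137 ARBICKLE, Dougal Bruce<CR> 20 Every Street
```

The <CR> marker sits exactly where the newline used to be, which is why the console jumps back to column one and overwrites the start of the line.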
Here's a simple demonstration of the problem:
"foo!`rbar" # -> *prints as* 'bar!'