Why won't PowerShell script find character?

Time:02-25

I have a PowerShell script that runs daily. It goes through all the files in a folder (there are 2000 of them) and does a find and replace, swapping one character for a line break. The character shows as an up arrow in Notepad, and as an FF character.

I have images below as well

$filename = Get-ChildItem "C:\Scripts\*filename*.*"
$filename | % {
    (gc $_) -replace "","`n`f" | Set-Content $_.fullname
}

As you can see, the arrow doesn't show up in the code block, but it does as plain text. A manual find and replace works, but when the script runs from Task Scheduler it doesn't seem to pick up anything to replace. Is there a different way of going about this?

Any help is appreciated!

(Image: code snippet screenshot)

CodePudding user response:

From what I can see in tests, Get-Content doesn't read files as Unicode by default, so I'd guess it's falling back to ASCII (but I don't know for sure).

So in your script you'll want to specify the encoding explicitly. I'd also suggest referencing the character by its Unicode code point rather than by the symbol itself; that way you don't need to worry about the encoding of the script file or the editor you're using. The following should do what you need (or at least it does on my machine).

$filename = Get-ChildItem "C:\Scripts\unicode.txt"
$filename | % {
    # Read as UTF-8 and match by code point; "`r`n" is a Windows line break (CR, then LF).
    (gc -Encoding utf8 $_.FullName) -replace "\u2191","`r`n" | Set-Content $_.FullName
}

CodePudding user response:

Okay, I think I figured out what character you are having problems with. It appears to be 0x0C. I used HxD to edit a text file and placed all the characters from 0x00 to 0x1F in there and found 0x0C was the fat up arrow.
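If a hex editor isn't handy, the same kind of check can be done from PowerShell itself. This is only a minimal sketch: `Get-ControlChar` is a hypothetical helper name, and the sample string is made up for illustration.

```powershell
# Hypothetical helper: list every control character (code point below 0x20,
# other than tab, LF and CR) found in a string, with its index and hex value.
function Get-ControlChar {
    param([string]$Text)

    for ($i = 0; $i -lt $Text.Length; $i++) {
        $code = [int]$Text[$i]
        if ($code -lt 0x20 -and $code -notin @(0x09, 0x0A, 0x0D)) {
            [pscustomobject]@{
                Index = $i
                Hex   = '0x{0:X2}' -f $code
            }
        }
    }
}

# Sample input with a form feed between two phrases (made up for illustration).
$sample = "page one" + [char]0x0C + "page two"
Get-ControlChar -Text $sample

# Against a real file, something like this should work (placeholder path):
# Get-ControlChar -Text (Get-Content -Raw 'C:\Scripts\somefile.txt')
```

For the sample string this should report a single entry whose Hex value is 0x0C, which is the code point discussed above.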

I've rewritten the code into something that should work. If it doesn't, try Keith's answer and replace the "\u2191" with "\u000C", or with "\x0C" (the regex engine wants exactly four hex digits after \u and two after \x, so a shortened "\u0C" won't parse). Either way, it should be something like this.

(gc $_) -replace "$([char]0x0C)","`n" | Set-Content $_.fullname
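For reference, the same character can be spelled several ways on the pattern side of -replace. The sketch below runs each spelling against a made-up sample string, so it can be tried without touching any real files.

```powershell
# Sample text containing one form feed (U+000C); made up for illustration.
$sample = "first" + [char]0x0C + "second"

# Five spellings of the same pattern. The first two put the literal character
# into the pattern; the last three are regex escapes, kept in single quotes so
# PowerShell passes the backslash through to the regex engine untouched.
$patterns = @(
    "`f"
    "$([char]0x0C)"
    '\f'
    '\x0C'
    '\u000C'
)

foreach ($p in $patterns) {
    # Each result should be the two words separated by a newline.
    $sample -replace $p, "`n"
}
```

If any one of these fails where another works, the likely culprit is how the script file itself was saved rather than the regex.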

EDIT:

In the comments below mklement0 pointed out that this should be the Form Feed character. That being the case, then this version should work.

(gc $_) -replace "`f","`n" | Set-Content $_.fullname

But if none of these work, don't forget to try Keith's version, and try both the "`f" and the Unicode "\u000C" variations with his code.
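Putting the pieces from both answers together, a loop over the files might look like the sketch below. It is only an illustration under stated assumptions: `Convert-FormFeed` is a hypothetical helper name, the files are assumed to be UTF-8 (adjust -Encoding if they are not), and -NoNewline needs PowerShell 5 or later.

```powershell
# Hypothetical helper: replace every form feed (U+000C) in the matching files
# with a newline. -Path accepts the same wildcard style as the original script.
function Convert-FormFeed {
    param(
        [Parameter(Mandatory)][string]$Path,
        [string]$Encoding = 'utf8'   # assumption: change to the files' real encoding
    )

    Get-ChildItem -Path $Path -File | ForEach-Object {
        # -Raw reads the whole file as one string instead of an array of lines.
        $text = Get-Content -LiteralPath $_.FullName -Raw -Encoding $Encoding
        # Only rewrite files that actually contain a form feed.
        if ($text -match '\x0C') {
            $text -replace '\x0C', "`n" |
                Set-Content -LiteralPath $_.FullName -Encoding $Encoding -NoNewline
        }
    }
}

# Example call, reusing the wildcard from the question:
# Convert-FormFeed -Path 'C:\Scripts\*filename*.*'
```

Writing the pattern as '\x0C' keeps the script file pure ASCII, so it should behave the same whether it is run interactively or from Task Scheduler.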
