I have a file on my PC called test.ps1 and a file hosted on my GitHub that is also called test.ps1.
Both have the same contents: the text "a string".
I am using the following script to try to compare them:
$fileA = Get-Content -Path "C:\Users\User\Desktop\test.ps1"
$fileB = (Invoke-webrequest -URI "https://raw.githubusercontent.com/repo/Scripts/test.ps1")
if(Compare-Object -ReferenceObject $fileA -DifferenceObject ($fileB -split '\r?\n'))
{"files are different"}
Else {"Files are the same"}
echo ""
Write-Host $fileA
echo ""
Write-Host $fileB
However, my output shows the exact same data for both files, yet it says the files are different. The output:
files are different
a string
a string
Is there some weird EOL thing going on?
CodePudding user response:
tl;dr
# Remove a trailing newline from the downloaded file content
# before splitting into lines.
# Parameter names omitted for brevity.
Compare-Object $fileA ($fileB -replace '\r?\n\z' -split '\r?\n' )
If the files are truly identical (save for any character-encoding and newline-format differences, and whether or not the local file has a trailing newline), you'll see no output, because Compare-Object only reports differences by default.
If the lines look the same, character encoding is probably not the problem. It's still worth pointing out that Get-Content in Windows PowerShell, in the absence of a BOM, assumes that a file is ANSI-encoded, so a UTF-8 file without a BOM that contains characters outside the ASCII range will be misinterpreted; use -Encoding utf8 to fix that.
Assuming that the files are truly identical (including not having variations in whitespace, such as trailing spaces at the end of lines), the likeliest explanation is that the file being retrieved has a trailing newline, as is typical for text files.
Get-Content returns the local file as an array of lines and does not produce an extra element for a trailing newline. By contrast, if you apply -split '\r?\n' to the multi-line string representing the entire downloaded content, the trailing newline leaves you with an extra, empty array element at the end, which Compare-Object reports as a difference.
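A minimal sketch of that effect, using in-memory stand-ins instead of the real files (the variable names here are illustrative and not part of the original script):

```powershell
# Stand-in for the downloaded content: one line plus a trailing newline.
$downloaded = "a string`n"

# Splitting on newlines yields an extra, empty element after the final newline.
$lines = $downloaded -split '\r?\n'
$lines.Count   # 2: 'a string' and ''

# Stand-in for what Get-Content returns for the same file: only the line itself.
$local = @('a string')

# The empty element has no counterpart in $local, so it is reported as a difference.
$diff = Compare-Object $local $lines
$diff.SideIndicator   # =>
```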
Compare-Object emitting an object evaluates to $true in the implied Boolean context of your if statement's conditional, which is why "files are different" is output.
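The Boolean coercion can be checked in isolation. This is a small sketch with literal arrays standing in for the two files:

```powershell
# Identical inputs: Compare-Object emits nothing, and "no output" coerces to $false.
$same = Compare-Object @('a string') @('a string')
[bool] $same     # False

# Differing inputs: a difference object is emitted, and a non-null object coerces to $true.
$differ = Compare-Object @('a string') @('a string', '')
[bool] $differ   # True
```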
The -replace operation above, -replace '\r?\n\z' (\z matches the very end of a (multi-line) string), compensates for that by removing the trailing newline before splitting into lines.
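As a rough illustration, assuming a two-line string with CRLF newlines as a stand-in for the downloaded content (the sample data and variable names are made up for this sketch):

```powershell
# Stand-in for downloaded content that ends with a trailing newline.
$downloaded = "line 1`r`nline 2`r`n"

# Without -replace: the trailing newline produces a third, empty element.
$raw = $downloaded -split '\r?\n'
$raw.Count       # 3

# With -replace: the final newline is removed first, so only the real lines remain.
$trimmed = $downloaded -replace '\r?\n\z' -split '\r?\n'
$trimmed.Count   # 2

# Compared against the equivalent array of lines, no differences are reported.
$diff = Compare-Object @('line 1', 'line 2') $trimmed
$null -eq $diff  # True
```

Note that \z only matches at the very end of the string, so this removes a single trailing newline; additional blank lines at the end of the file would still show up as empty elements.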