I'm performing some pretty straightforward checksums on our Linux boxes, but I now need to recreate something similar for our Windows users. To get a single checksum, I just run:
md5sum *.txt | awk '{ print $1 }' | md5sum
I'm struggling to recreate this in Windows, either with a batch file or PowerShell. The closest I've got is:
Get-ChildItem $path -Filter *.txt |
Foreach-Object {
$hash = Get-FileHash -Algorithm MD5 -Path ($path + "\" + $_) | Select -ExpandProperty "Hash"
$hash = $hash.tolower() #Get-FileHash returns checksums in uppercase, linux in lower case (!)
Write-host $hash
}
This prints the same per-file checksums to the console as the Linux command, but piping that back to Get-FileHash to get a single output that matches the Linux equivalent is eluding me. Writing to a file gets me stuck with carriage return differences.
Streaming as a string back to Get-FileHash doesn't return the same checksum:
$String = Get-FileHash -Algorithm MD5 -Path (Get-ChildItem -path $files -Recurse) | Select -ExpandProperty "Hash"
$stringAsStream = [System.IO.MemoryStream]::new()
$writer = [System.IO.StreamWriter]::new($stringAsStream)
$writer.write($stringAsStream)
Get-FileHash -Algorithm MD5 -InputStream $stringAsStream
Am I over-engineering this? I'm sure this shouldn't be this complicated! TIA
CodePudding user response:
You need to reference the .Hash property on the object returned by Get-FileHash. If you want a view similar to md5sum, you can also use Select-Object to curate this:
# Get filehashes in $path with similar output to md5sum
$fileHashes = Get-ChildItem $path -File | Get-FileHash -Algorithm MD5
# Once you have the hashes, you can reference the properties as follows
# .Algorithm is the hashing algo
# .Hash is the actual file hash
# .Path is the full path to the file
foreach( $hash in $fileHashes ){
"$($hash.Algorithm):$($hash.Hash) ($($hash.Path))"
}
For each file in $path, the above foreach loop will produce a line similar to:
MD5:B4976887F256A26B59A9D97656BF2078 (C:\Users\username\dl\installer.msi)
The algorithm, hash, and filenames will obviously differ based on your selected hashing algorithm and filesystem.
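If the goal is output closer to what md5sum itself prints (lowercase hash, two spaces, file name), a minimal sketch could look like the following. Format-Md5SumLine is a hypothetical helper name introduced here for illustration; it is not part of the answer above or of PowerShell.

```powershell
# Hypothetical helper (an assumption, not a built-in cmdlet): format
# Get-FileHash results as md5sum-style lines: lowercase hash, two spaces, file name.
function Format-Md5SumLine {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        $FileHash   # an object with .Hash and .Path, as returned by Get-FileHash
    )
    process {
        # GetFileName avoids wildcard interpretation of characters such as [ ].
        $name = [System.IO.Path]::GetFileName($FileHash.Path)
        '{0}  {1}' -f $FileHash.Hash.ToLowerInvariant(), $name
    }
}

# Usage, assuming $path points at the directory to list:
# Get-ChildItem $path -File | Get-FileHash -Algorithm MD5 | Format-Md5SumLine
```

This only reshapes the per-file lines; it does not combine them into a single checksum.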
CodePudding user response:
The devil is in the details:
- (known already) Get-FileHash returns checksums in uppercase while Linux md5sum returns them in lower case (!);
- The FileSystem provider's filter *.txt is not case sensitive in PowerShell, while in Linux it depends on the option nocaseglob. If set (shopt -s nocaseglob), Bash matches filenames in a case-insensitive fashion when performing filename expansion. Otherwise (shopt -u nocaseglob), filename matching is case-sensitive;
- Order: Get-ChildItem output is ordered according to the Unicode collation algorithm, while in Linux the *.txt filter is expanded in the order of the LC_COLLATE category (LC_COLLATE="C.UTF-8" on my system).
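The ordering point is the easiest one to miss, so here is a small sketch of how the default Sort-Object order differs from an ordinal sort. The three file names are made up for illustration. Note that [System.StringComparer]::Ordinal compares UTF-16 code units, which should agree with a codepoint order such as C.UTF-8 for typical names, but can differ for names containing characters outside the Basic Multilingual Plane.

```powershell
# Illustrative names only (an assumption); any mixed-case set shows the difference.
$names = [string[]]('b.txt', 'a.txt', 'B.txt')

# Culture-aware, case-insensitive ordering (the Sort-Object default).
$cultureOrder = $names | Sort-Object

# Ordinal ordering compares UTF-16 code units, so 'B' (0x42) sorts before 'a' (0x61).
$ordinalOrder = [string[]]$names.Clone()
[System.Array]::Sort($ordinalOrder, [System.StringComparer]::Ordinal)

"Culture: $($cultureOrder -join ', ')"
"Ordinal: $($ordinalOrder -join ', ')"   # Ordinal: B.txt, a.txt, b.txt
```

Because the combined checksum is computed over the per-file hashes in sequence, any difference in file order changes the final result even when every individual hash matches.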
In the following (partially commented) script, three # Test blocks demonstrate my debugging steps to the final solution:
Function Get-StringHash {
[OutputType([System.String])]
param(
# named or positional: a string
[Parameter(Position=0)]
[string]$InputObject
)
$stringAsStream = [System.IO.MemoryStream]::new()
$writer = [System.IO.StreamWriter]::new($stringAsStream)
$writer.write( $InputObject)
$writer.Flush()
$stringAsStream.Position = 0
Get-FileHash -Algorithm MD5 -InputStream $stringAsStream |
Select-Object -ExpandProperty Hash
$writer.Close()
$writer.Dispose()
$stringAsStream.Close()
$stringAsStream.Dispose()
}
function ConvertTo-Utf8String {
[OutputType([System.String])]
param(
# named or positional: a string
[Parameter(Position=0, Mandatory = $false)]
[string]$InputObject = ''
)
begin {
$InChars = [char[]]$InputObject
$InChLen = $InChars.Count
$AuxU_8 = [System.Collections.ArrayList]::new()
}
process {
for ($ii = 0; $ii -lt $InChLen; $ii++) {
if ( [char]::IsHighSurrogate( $InChars[$ii]) -and
( 1 + $ii) -lt $InChLen -and
[char]::IsLowSurrogate( $InChars[1 + $ii]) ) {
$s = [char]::ConvertFromUtf32(
[char]::ConvertToUtf32( $InChars[$ii], $InChars[1 + $ii]))
$ii++
} else {
$s = $InChars[$ii]
}
[void]$AuxU_8.Add(
([System.Text.UTF32Encoding]::UTF8.GetBytes($s) |
ForEach-Object { '{0:X2}' -f $_}) -join ''
)
}
}
end { $AuxU_8 -join '' }
}
# Set variables
$hashUbuntu = '5d944e44149fece685d3eb71fb94e71b'
$hashUbuntu <# copied from 'Ubuntu 20.04 LTS' in Wsl2:
cd `wslpath -a 'D:\\bat'`
md5sum *.txt | awk '{ print $1 }' | md5sum | awk '{ print $1 }'
<##>
$LF = [char]0x0A # Line Feed (LF)
$path = 'D:\Bat' # testing directory
$filenames = 'D:\bat\md5sum_Ubuntu_awk.lst'
<# obtained from 'Ubuntu 20.04 LTS' in Wsl2:
cd `wslpath -a 'D:\\bat'`
md5sum *.txt | awk '{ print $1 }' > md5sum_Ubuntu_awk.lst
md5sum md5sum_Ubuntu_awk.lst | awk '{ print $1 }' # for reference
<##>
# Test #1: is `Get-FileHash` the same (beyond character case)?
$hashFile = Get-FileHash -Algorithm MD5 -Path $filenames |
Select-Object -ExpandProperty Hash
$hashFile.ToLower() -ceq $hashUbuntu
# Test #2: is `$stringToHash` well-defined? is `Get-StringHash` the same?
$hashArray = Get-Content $filenames -Encoding UTF8
$stringToHash = ($hashArray -join $LF) + $LF
(Get-StringHash -InputObject $stringToHash) -eq $hashUbuntu
# Test #3: another check: is `Get-StringHash` the same?
Push-Location -Path $path
$filesInBashOrder = bash.exe -c "ls -1 *.txt"
$hashArray = $filesInBashOrder |
Foreach-Object {
$hash = Get-FileHash -Algorithm MD5 -Path (
Join-Path -Path $path -ChildPath $_) |
Select-Object -ExpandProperty "Hash"
$hash.tolower()
}
$stringToHash = ($hashArray -join $LF) + $LF
(Get-StringHash -InputObject $stringToHash) -eq $hashUbuntu
Pop-Location
# Solution - ordinal order assuming `LC_COLLATE="C.UTF-8"` in Linux
Push-Location -Path $path
$hashArray = Get-ChildItem -Filter *.txt -Force -ErrorAction SilentlyContinue |
Where-Object {$_.Name -clike "*.txt"} | # only if `shopt -u nocaseglob`
Sort-Object -Property { (ConvertTo-Utf8String -InputObject $_.Name) } |
Get-FileHash -Algorithm MD5 |
Select-Object -ExpandProperty "Hash" |
Foreach-Object {
$_.ToLower()
}
$stringToHash = ($hashArray -join $LF) + $LF
(Get-StringHash -InputObject $stringToHash).ToLower() -ceq $hashUbuntu
Pop-Location
Output (tested on 278 files): .\SO\69181414.ps1
5d944e44149fece685d3eb71fb94e71b
True
True
True
True
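As a possible simplification of the Get-StringHash function in the script above, the string can also be hashed without a MemoryStream by calling the .NET MD5 class on its UTF-8 bytes. This is only a sketch: Get-Md5Hex and Join-HashLines are hypothetical names introduced here, and the approach assumes, as the script does, that the per-file hashes are joined with LF and end with a trailing LF, which is what the md5sum | awk pipeline emits.

```powershell
# Hypothetical helper (an assumption): MD5 of a string's UTF-8 bytes, as lowercase hex.
function Get-Md5Hex {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyString()]
        [string]$Text
    )
    $md5 = [System.Security.Cryptography.MD5]::Create()
    try {
        # Encoding.UTF8.GetBytes does not prepend a byte order mark.
        $bytes = [System.Text.Encoding]::UTF8.GetBytes($Text)
        $hash  = $md5.ComputeHash($bytes)
        # BitConverter yields 'AB-CD-...'; drop the dashes and lowercase the result.
        [System.BitConverter]::ToString($hash).Replace('-', '').ToLowerInvariant()
    }
    finally {
        $md5.Dispose()
    }
}

# Hypothetical helper (an assumption): one lowercase hash per line, each line ending in LF.
function Join-HashLines {
    param([string[]]$Hashes)
    $lf = [string][char]0x0A
    (($Hashes | ForEach-Object { $_.ToLowerInvariant() }) -join $lf) + $lf
}

Get-Md5Hex -Text 'abc'   # 900150983cd24fb0d6963f7d28e17f72
```

With an already ordered $hashArray, as built in the Solution block, the combined checksum would then be Get-Md5Hex -Text (Join-HashLines -Hashes $hashArray). The file ordering and case-sensitivity caveats listed earlier still apply.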