It seems like PowerShell's .Extension property doesn't recognize double extensions.
As a result, something like this:
Get-ChildItem -Path:. | Rename-Item -NewName {
    '{0}{1}' -f ($_.BaseName -replace '\.', '_'), $_.Extension
}
ends up renaming file.1.tar.gz to file_1_tar.gz, while I want file_1.tar.gz.
Just an oversight? Or can it be mitigated?
CodePudding user response:
This is the intended behavior.
Here's a quote from the Microsoft documentation for the [System.IO.Path]::GetExtension method, which seems to share the same logic as other similar functions and PowerShell cmdlets.
This method obtains the extension of path by searching path for a period (.), starting with the last character in path and continuing toward the first character. If a period is found before a DirectorySeparatorChar or AltDirectorySeparatorChar character, the returned string contains the period and the characters after it; otherwise, String.Empty is returned.
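To see the quoted behavior in practice, here is a minimal sketch that calls the same .NET methods on plain strings (the file names are only examples, and no files need to exist):

```powershell
# GetExtension returns only the text from the last period onward.
[System.IO.Path]::GetExtension('file.1.tar.gz')                 # → .gz
# The matching "base name" therefore still contains '.tar'.
[System.IO.Path]::GetFileNameWithoutExtension('file.1.tar.gz')  # → file.1.tar
# With no period in the name, an empty string is returned.
[System.IO.Path]::GetExtension('README')                        # → (empty string)
```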
You can of course mitigate this by treating tar.gz and other double extensions as special cases, for instance by keeping a list (or dictionary) of the suffixes that should count as double extensions. Here is an example of such a mitigation.
Get-ChildItem -Path 'C:\temp' | % {
    $BaseName = $_.BaseName
    $Extension = $_.Extension
    if ($_.FullName.EndsWith('.tar.gz')) {
        # Drop the trailing '.tar' (4 characters) from the base name
        $BaseName = $_.BaseName.SubString(0, $_.BaseName.Length - 4)
        # I used the full name as reference to preserve the case on the filenames
        $Extension = $_.FullName.Substring($_.FullName.Length - 7)
    }
    # '\.' matches a literal period; an unescaped '.' would replace every character
    Rename-Item -Path $_.FullName -NewName (
        '{0}{1}' -f ($BaseName -replace '\.', '_'), $Extension
    )
}
Note that since I only had .tar.gz, I did not actually use a dictionary. Should you have several double extension types, the best way would be to apply this method in a loop against each extension to see whether it matches.
Take this, for instance, which loops through an array to check against multiple double extensions:
# Outside of the main loop to avoid redeclaring each time.
$DoubleExtensionDict = @('.tar.gz', '.abc.def')
Get-ChildItem -Path 'C:\temp' | % {
    $BaseName = $_.BaseName
    $Extension = $_.Extension
    Foreach ($ext in $DoubleExtensionDict) {
        if ($_.FullName.EndsWith($ext)) {
            # Length of the first part of the double extension, e.g. 'tar' in '.tar.gz'
            $FirstExtLength = ($ext.split('.')[1]).Length
            $BaseName = $_.BaseName.SubString(0, $_.BaseName.Length - $FirstExtLength - 1)
            # Use the length of the matched extension rather than a hard-coded 7
            $Extension = $_.FullName.Substring($_.FullName.Length - $ext.Length)
            break
        }
    }
    # '\.' matches a literal period; an unescaped '.' would replace every character
    Rename-Item -Path $_.FullName -NewName (
        '{0}{1}' -f ($BaseName -replace '\.', '_'), $Extension
    )
}
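The same lookup can also be kept separate from the renaming step, which makes it easy to try on plain strings first. The following is only a sketch: Split-FileName is a hypothetical helper name (not a built-in cmdlet), the extension list is the example list from above, and the case-insensitive comparison is an assumption about what is wanted.

```powershell
# Hypothetical helper (not part of PowerShell): splits a file name into a base
# name and an extension, treating the listed double extensions as one unit.
function Split-FileName {
    param(
        [Parameter(Mandatory)] [string] $Name,
        [string[]] $KnownExtensions = @('.tar.gz', '.abc.def')
    )
    foreach ($ext in $KnownExtensions) {
        # Case-insensitive match; the original casing of the name is preserved.
        if ($Name.Length -gt $ext.Length -and
            $Name.EndsWith($ext, [System.StringComparison]::OrdinalIgnoreCase)) {
            return [pscustomobject]@{
                BaseName  = $Name.Substring(0, $Name.Length - $ext.Length)
                Extension = $Name.Substring($Name.Length - $ext.Length)
            }
        }
    }
    # No known double extension: fall back to the standard single-extension split.
    [pscustomobject]@{
        BaseName  = [System.IO.Path]::GetFileNameWithoutExtension($Name)
        Extension = [System.IO.Path]::GetExtension($Name)
    }
}

$parts = Split-FileName -Name 'file.1.TAR.GZ'
'{0}{1}' -f ($parts.BaseName -replace '\.', '_'), $parts.Extension   # → file_1.TAR.GZ
```

Inside a Rename-Item -NewName script block, the helper would be called with $_.Name instead of a literal string.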
CodePudding user response:
As Sage Pourpre's helpful answer notes, by design only the last .-separated token is considered the filename extension.
Short of creating a hashtable (dictionary) of known "multi-extensions", as Sage proposes - which would have to anticipate all of them - you can try a heuristic based on the following rules (since you need to rule out considering .1 part of the multi-extension):
- Any nonempty run of .-separated tokens that do not start with a digit, at the end of a file name, is considered the multi-extension, including the . chars:
@{ Name = 'file.1.tar.gz'},
@{ Name = 'file.1.tar.zip'},
@{ Name = 'file.1.html'},
@{ Name = 'file.1.ps1'},
@{ Name = 'other.1'} |
  ForEach-Object {
    # Group 1: base name (lazy). Group 2: trailing run of tokens that each
    # start with a character that is neither a digit nor a period.
    if ($_.Name -match '^(.+?)((?:\.[^\d.][^.]*)+)$') {
      $basename = $Matches[1]
      $exts = $Matches[2]
    } else {
      $basename = $_.Name
      $exts = ''
    }
    ($basename -replace '\.', '_') + $exts
  }
The above yields:
file_1.tar.gz
file_1.tar.zip
file_1.html
file_1.ps1
other_1
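To apply the heuristic to real files rather than sample hashtables, it can be wrapped in a small function and passed to Rename-Item. This is a sketch under assumptions: Get-UnderscoredName is a hypothetical helper name, 'C:\temp' is a placeholder path, and the regex is one way of expressing the rule stated above.

```powershell
# Hypothetical helper: computes the new name for a given file name string.
function Get-UnderscoredName {
    param([Parameter(Mandatory)] [string] $Name)
    # Group 1: base name (lazy). Group 2: trailing run of tokens that each
    # start with a character that is neither a digit nor a period.
    if ($Name -match '^(.+?)((?:\.[^\d.][^.]*)+)$') {
        ($Matches[1] -replace '\.', '_') + $Matches[2]
    } else {
        # No multi-extension found: replace every period.
        $Name -replace '\.', '_'
    }
}

# -WhatIf only previews the renames; remove it to actually rename the files.
Get-ChildItem -File -Path 'C:\temp' |
    Rename-Item -WhatIf -NewName { Get-UnderscoredName -Name $_.Name }
```

Keeping the name calculation in its own function means it can be checked against sample strings before any file is touched.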