I have a script that parses all shares on a file server to gather share size, ACLs, and file and folder counts. It works well on smaller file servers, but on hosts with large shares it consumes all available RAM and crashes the host. I can't figure out how to optimize the Get-ChildItem portion of the script so that it doesn't consume all RAM.
I found a few articles that suggest using a foreach loop and piping out only what I need, but as a PowerShell beginner I can't figure out how to make that work. What can I try next?
$ScopeName = Read-Host "Enter scope name to gather data on"
$SavePath = Read-Host "Path to save results and log to"
$SaveCSVPath = "$SavePath\ShareData.csv"
$TranscriptLog = "$SavePath\Transcript.log"
Write-Host
Start-Transcript -Path $TranscriptLog
$StartTime = Get-Date
$Start = $StartTime | Select-Object -ExpandProperty DateTime
$Exclusions = {$_.Description -ne "Remote Admin" -and $_.Description -ne "Default Share" -and $_.Description -ne "Remote IPC" }
$FileShares = Get-SmbShare -ScopeName $ScopeName | Where-Object $Exclusions
$Count = $FileShares.Count
Write-Host
Write-Host "Gathering data for $Count shares" -ForegroundColor Green
Write-Host
Write-Host "Results will be saved to $SaveCSVPath" -ForegroundColor Green
Write-Host
ForEach ($FileShare in $FileShares)
{
$ShareName = $FileShare.Name
$Path = $Fileshare.Path
Write-Host "Working on: $ShareName - $Path" -ForegroundColor Yellow
$GetObjectInfo = Get-Childitem -Path $Path -Recurse -Force -ErrorAction SilentlyContinue
$ObjSize = $GetObjectInfo | Measure-Object -Property Length -Sum -ErrorAction SilentlyContinue
$ObjectSizeMB = "{0:N2}" -f ($ObjSize.Sum / 1MB)
$ObjectSizeGB = "{0:N2}" -f ($ObjSize.Sum / 1GB)
$ObjectSizeTB = "{0:N2}" -f ($ObjSize.Sum / 1TB)
$NumFiles = ($GetObjectInfo | Where-Object {-not $_.PSIsContainer}).Count
$NumFolders = ($GetObjectInfo | Where-Object {$_.PSIsContainer}).Count
$ACL = Get-Acl -Path $Path
$LastAccessTime = Get-ItemProperty $Path | Select-Object -ExpandProperty LastAccessTime
$LastWriteTime = Get-ItemProperty $Path | Select-Object -ExpandProperty LastWriteTime
$Table = [PSCustomObject]@{
'ScopeName' = $FileShare.ScopeName
'Sharename' = $ShareName
'SharePath' = $Path
'Owner' = $ACL.Owner
'Permissions' = $ACL.AccessToString
'LastAccess' = $LastAccessTime
'LastWrite' = $LastWriteTime
'Size (MB)' = $ObjectSizeMB
'Size (GB)' = $ObjectSizeGB
'Size (TB)' = $ObjectSizeTB
'Total File Count' = $NumFiles
'Total Folder Count' = $NumFolders
'Total Item Count' = $GetObjectInfo.Count
}
$Table | Export-CSV -Path $SaveCSVPath -Append -NoTypeInformation
}
$EndTime = Get-Date
$End = $EndTime | Select-Object -ExpandProperty DateTime
Write-Host
Write-Host "Script start time: $Start" -ForegroundColor Green
Write-Host "Script end time: $End" -ForegroundColor Green
Write-Host
$ElapsedTime = $(($EndTime-$StartTime))
Write-Host "Elapsed time: $($ElapsedTime.Days) Days $($ElapsedTime.Hours) Hours $($ElapsedTime.Minutes) Minutes $($ElapsedTime.Seconds) Seconds $($ElapsedTime.MilliSeconds) Milliseconds" -ForegroundColor Cyan
Write-Host
Write-Host "Results saved to $SaveCSVPath" -ForegroundColor Green
Write-Host
Write-Host "Transcript saved to $TranscriptLog" -ForegroundColor Green
Write-Host
Stop-Transcript
CodePudding user response:
To use the PowerShell pipeline correctly (and save memory, because each item is streamed separately), use the ForEach-Object cmdlet rather than the foreach statement, avoid assigning the pipeline output to a variable (as you are doing with $FileShares = ...), and don't wrap the pipeline in parentheses ((...)):
Get-SmbShare -ScopeName $ScopeName | Where-Object $Exclusions | ForEach-Object {
Then replace every $FileShare variable in your loop with the current-item variable $_ (e.g. $FileShare.Name → $_.Name).
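To make the shape of that pipeline concrete, here is a minimal sketch. It assumes two made-up share objects ($DemoShares is hypothetical stand-in data for the Get-SmbShare output, so it runs without a file server) and reuses the $Exclusions filter from the question. Each row is emitted by ForEach-Object and flows straight into a single CSV cmdlet at the end of the pipeline.

```powershell
# Hypothetical stand-ins for Get-SmbShare output; names and paths are invented.
$DemoShares = @(
    [PSCustomObject]@{ Name = 'Finance'; Path = 'D:\Shares\Finance'; Description = '' }
    [PSCustomObject]@{ Name = 'ADMIN$';  Path = 'C:\Windows';        Description = 'Remote Admin' }
)

# Same filter as in the question.
$Exclusions = { $_.Description -ne 'Remote Admin' -and $_.Description -ne 'Default Share' -and $_.Description -ne 'Remote IPC' }

# Inside ForEach-Object, $_ is the current share; one row is emitted per share.
$Csv = $DemoShares | Where-Object $Exclusions | ForEach-Object {
    [PSCustomObject]@{
        'Sharename' = $_.Name
        'SharePath' = $_.Path
    }
} | ConvertTo-Csv -NoTypeInformation

$Csv
```

In the real script you would start the pipeline with Get-SmbShare -ScopeName $ScopeName and end it with Export-Csv -Path $SaveCSVPath -NoTypeInformation instead of ConvertTo-Csv. Note that the share count is not known up front when streaming, so the "Gathering data for $Count shares" message would have to go or be computed separately.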
For the Get-ChildItem part you can do the same thing: stream the results through the pipeline rather than piling everything up in $GetObjectInfo:
$ObjSize = Get-Childitem -Path $Path -Recurse -Force -ErrorAction SilentlyContinue |
Measure-Object -Property Length -Sum -ErrorAction SilentlyContinue
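If you want to try this without pointing it at a real share, the following sketch builds a small throwaway directory tree ($DemoPath is a hypothetical stand-in for $Path) and measures it in one streamed pipeline. I have added -File here, which is my own assumption about what you want: it limits the output to files so Measure-Object only ever sees objects that have a Length property.

```powershell
# Build a tiny demo tree in the temp folder: one subfolder, two files of known size.
$DemoPath = Join-Path ([System.IO.Path]::GetTempPath()) ("ShareDemo_" + [guid]::NewGuid())
$SubPath  = Join-Path $DemoPath 'Sub'
New-Item -ItemType Directory -Path $SubPath -Force | Out-Null
[System.IO.File]::WriteAllBytes((Join-Path $DemoPath 'a.txt'), [byte[]]::new(4))
[System.IO.File]::WriteAllBytes((Join-Path $SubPath 'b.txt'), [byte[]]::new(2))

# Stream files straight into Measure-Object; nothing is kept in a variable
# except the small measurement result.
$ObjSize = Get-ChildItem -Path $DemoPath -Recurse -Force -File -ErrorAction SilentlyContinue |
    Measure-Object -Property Length -Sum

"Files: $($ObjSize.Count), total bytes: $($ObjSize.Sum)"

# Clean up the demo tree.
Remove-Item -Path $DemoPath -Recurse -Force
```

With -File the Count property of the result is also the file count, so a separate pass for $NumFiles is not needed; folders would still have to be counted some other way.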
As an aside, you might simplify your three size properties into a single, smarter size property; see: How to convert value to KB, MB, or GB depending on digit placeholders?
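One possible shape for such a property is sketched below. Format-ByteSize is a hypothetical helper name of mine, not a built-in cmdlet and not necessarily what the linked question uses; it divides by 1024 until the value fits the largest suitable unit.

```powershell
# Hypothetical helper: pick the largest unit that keeps the value at 1 or above.
function Format-ByteSize {
    param(
        [Parameter(Mandatory)]
        [double]$Bytes
    )

    $units = 'B', 'KB', 'MB', 'GB', 'TB', 'PB'
    $index = 0
    while ($Bytes -ge 1024 -and $index -lt ($units.Count - 1)) {
        $Bytes /= 1024
        $index++
    }

    # Two decimals, formatted with the current culture's number settings.
    '{0:N2} {1}' -f $Bytes, $units[$index]
}

Format-ByteSize 1536
Format-ByteSize 5GB
```

In the question's script this could replace the three size columns with something like 'Size' = Format-ByteSize $ObjSize.Sum. The trade-off is that the column becomes text with mixed units, which is easier to read but harder to sort in a spreadsheet.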
CodePudding user response:
You are buffering the entire collection of [FileSystemInfo] objects on $FileShare into a variable with...
$GetObjectInfo = Get-Childitem -Path $Path -Recurse -Force -ErrorAction SilentlyContinue
So, if there are a million directories and files on that share, then that's a million [FileSystemInfo] instances stored in a million-element array, none of which can be garbage collected during that iteration of the foreach loop. You can use Group-Object to improve that a bit...
$groupsByPSIsContainer = Get-Childitem -Path $Path -Recurse -Force -ErrorAction SilentlyContinue |
Group-Object -Property 'PSIsContainer' -AsHashTable
# $groupsByPSIsContainer is a [Hashtable] with two keys:
# - $true gets the collection of directories
# - $false gets the collection of files
$ObjSize = $groupsByPSIsContainer[$false] | Measure-Object -Property Length -Sum -ErrorAction SilentlyContinue
$NumFiles = $groupsByPSIsContainer[$false].Count
$NumFolders = $groupsByPSIsContainer[$true].Count
...but that still ends up storing all of the [FileSystemInfo] objects in the two branches of the [Hashtable]. Instead, I would just enumerate and count the results myself...
$ObjSize = 0L # Stores the total file size directly; use $ObjSize instead of $ObjSize.Sum
$NumFiles = 0
$NumFolders = 0
foreach ($fileSystemInfo in Get-Childitem -Path $Path -Recurse -Force -ErrorAction SilentlyContinue)
{
    if ($fileSystemInfo.PSIsContainer)
    {
        # Directory: only the folder counter changes
        $NumFolders++
    }
    else
    {
        # File: count it and add its size to the running total
        $NumFiles++
        $ObjSize += $fileSystemInfo.Length
    }
}
That stores only the current enumeration result in $fileSystemInfo and never the entire sequence.
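One caveat worth checking on your PowerShell version: as far as I know, the foreach statement may gather the command's complete output before the first iteration runs, in which case the loop above would not stream after all. A pipeline into ForEach-Object processes items one at a time, so here is the same counting logic in that form. This is a sketch run against a throwaway demo tree ($DemoPath is a hypothetical stand-in for $Path).

```powershell
# Build a tiny demo tree in the temp folder: one subfolder, two files of known size.
$DemoPath = Join-Path ([System.IO.Path]::GetTempPath()) ("ShareDemo_" + [guid]::NewGuid())
$SubPath  = Join-Path $DemoPath 'Sub'
New-Item -ItemType Directory -Path $SubPath -Force | Out-Null
[System.IO.File]::WriteAllBytes((Join-Path $DemoPath 'a.txt'), [byte[]]::new(4))
[System.IO.File]::WriteAllBytes((Join-Path $SubPath 'b.txt'), [byte[]]::new(2))

$ObjSize    = 0L   # running total of file sizes in bytes
$NumFiles   = 0
$NumFolders = 0

# ForEach-Object runs its script block in the caller's scope, so the
# counters above can be updated directly; $_ is the current item.
Get-ChildItem -Path $DemoPath -Recurse -Force -ErrorAction SilentlyContinue | ForEach-Object {
    if ($_.PSIsContainer) {
        $NumFolders++
    }
    else {
        $NumFiles++
        $ObjSize += $_.Length
    }
}

"Folders: $NumFolders, files: $NumFiles, bytes: $ObjSize"

# Clean up the demo tree.
Remove-Item -Path $DemoPath -Recurse -Force
```

The total item count for the CSV is then simply $NumFiles + $NumFolders, so $GetObjectInfo.Count is no longer needed.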
Note that if you weren't summing the files' sizes, Group-Object would work well...
$groupsByIsContainer = Get-Childitem -Path $Path -Recurse -Force -ErrorAction SilentlyContinue |
Group-Object -Property 'PSIsContainer' -NoElement
$NumFiles = ($groupsByIsContainer | Where-Object -Property 'Name' -EQ -Value $false).Count
$NumFolders = ($groupsByIsContainer | Where-Object -Property 'Name' -EQ -Value $true ).Count
-NoElement prevents the resulting group objects from storing the grouped elements; we care only about the count of members in each group, not the members themselves. If we passed -AsHashTable then we'd lose the convenient Count property, which is why the two groups have to be accessed in this awkward way.