How to optimize Powershell script with Get-ChildItem consuming all RAM

Time:08-15

I have a script that parses all shares on a file server to gather share size, ACLs, and file and folder counts. It works well on smaller file servers, but on hosts with large shares it consumes all RAM and crashes the host. I can't figure out how to optimize the Get-ChildItem portion so that it doesn't consume all RAM.

I found a few articles that suggest using a foreach loop and piping out only what I need. I am a PowerShell beginner and can't figure out how to make that work. What can I try next?

$ScopeName     = Read-Host "Enter scope name to gather data on"
$SavePath      = Read-Host "Path to save results and log to"
$SaveCSVPath   = "$SavePath\ShareData.csv"
$TranscriptLog = "$SavePath\Transcript.log"

Write-Host
Start-Transcript -Path $TranscriptLog

$StartTime = Get-Date
$Start     = $StartTime | Select-Object -ExpandProperty DateTime

$Exclusions = {$_.Description -ne "Remote Admin" -and $_.Description -ne "Default Share" -and $_.Description -ne "Remote IPC" }
$FileShares = Get-SmbShare -ScopeName $ScopeName | Where-Object $Exclusions
$Count      = $FileShares.Count
Write-Host
Write-Host "Gathering data for $Count shares" -ForegroundColor Green
Write-Host
Write-Host "Results will be saved to $SaveCSVPath" -ForegroundColor Green
Write-Host

ForEach ($FileShare in $FileShares)
{
    $ShareName = $FileShare.Name
    $Path      = $Fileshare.Path

    Write-Host "Working on: $ShareName - $Path" -ForegroundColor Yellow
    
    $GetObjectInfo = Get-Childitem -Path $Path -Recurse -Force -ErrorAction SilentlyContinue

    $ObjSize = $GetObjectInfo | Measure-Object -Property Length -Sum -ErrorAction SilentlyContinue

    $ObjectSizeMB = "{0:N2}" -f ($ObjSize.Sum / 1MB)
    $ObjectSizeGB = "{0:N2}" -f ($ObjSize.Sum / 1GB)
    $ObjectSizeTB = "{0:N2}" -f ($ObjSize.Sum / 1TB)

    $NumFiles   = ($GetObjectInfo | Where-Object {-not $_.PSIsContainer}).Count
    $NumFolders = ($GetObjectInfo | Where-Object {$_.PSIsContainer}).Count
    
    $ACL            = Get-Acl -Path $Path
    $LastAccessTime = Get-ItemProperty $Path | Select-Object -ExpandProperty LastAccessTime
    $LastWriteTime  = Get-ItemProperty $Path | Select-Object -ExpandProperty LastWriteTime

    $Table = [PSCustomObject]@{
        'ScopeName'          = $FileShare.ScopeName
        'Sharename'          = $ShareName
        'SharePath'          = $Path
        'Owner'              = $ACL.Owner
        'Permissions'        = $ACL.AccessToString
        'LastAccess'         = $LastAccessTime
        'LastWrite'          = $LastWriteTime
        'Size (MB)'          = $ObjectSizeMB
        'Size (GB)'          = $ObjectSizeGB
        'Size (TB)'          = $ObjectSizeTB
        'Total File Count'   = $NumFiles
        'Total Folder Count' = $NumFolders
        'Total Item Count'   = $GetObjectInfo.Count
    }

    $Table | Export-CSV -Path $SaveCSVPath -Append -NoTypeInformation 
}

$EndTime = Get-Date
$End     = $EndTime | Select-Object -ExpandProperty DateTime

Write-Host
Write-Host "Script start time: $Start" -ForegroundColor Green
Write-Host "Script end time: $End" -ForegroundColor Green

Write-Host
$ElapsedTime = $(($EndTime-$StartTime))
Write-Host "Elapsed time: $($ElapsedTime.Days) Days $($ElapsedTime.Hours) Hours $($ElapsedTime.Minutes) Minutes $($ElapsedTime.Seconds) Seconds $($ElapsedTime.MilliSeconds) Milliseconds" -ForegroundColor Cyan

Write-Host
Write-Host "Results saved to $SaveCSVPath" -ForegroundColor Green

Write-Host
Write-Host "Transcript saved to $TranscriptLog" -ForegroundColor Green

Write-Host
Stop-Transcript

CodePudding user response:

To use the PowerShell pipeline correctly (and conserve memory, since each item is streamed separately), use the ForEach-Object cmdlet rather than the ForEach statement, avoid assigning the pipeline to a variable (as you are doing with $FileShares = ...), and don't wrap the pipeline in parentheses ((...)):

Get-SmbShare -ScopeName $ScopeName | Where-Object $Exclusions | ForEach-Object {

Then replace every $FileShare variable in your loop with the current-item variable $_ (e.g. $FileShare.Name becomes $_.Name).
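
A minimal sketch of that loop shape follows. The $sampleShares objects are hypothetical stand-ins so the snippet runs without an SMB server; in the real script the pipeline would presumably start with Get-SmbShare -ScopeName $ScopeName and end with Export-Csv instead of ConvertTo-Csv.

```powershell
# Stand-in share objects so the sketch runs anywhere (hypothetical data).
# In the real script the pipeline would start with:
#   Get-SmbShare -ScopeName $ScopeName
$sampleShares = @(
    [PSCustomObject]@{ Name = 'Data';   Path = 'D:\Data';    Description = '' }
    [PSCustomObject]@{ Name = 'ADMIN$'; Path = 'C:\Windows'; Description = 'Remote Admin' }
)

$Exclusions = { $_.Description -ne 'Remote Admin' -and $_.Description -ne 'Default Share' -and $_.Description -ne 'Remote IPC' }

$sampleShares | Where-Object $Exclusions | ForEach-Object {
    # Capture what you need from $_ first: nested pipelines rebind $_.
    $ShareName = $_.Name
    $Path      = $_.Path

    # ... gather size, counts and ACL for $Path here ...

    # Emit one result object per share; it flows on to the next command.
    [PSCustomObject]@{
        Sharename = $ShareName
        SharePath = $Path
    }
} | ConvertTo-Csv -NoTypeInformation
```

Because the result objects are emitted from the script block rather than exported inside it, a single Export-Csv at the end of the pipeline should be enough and -Append would no longer be needed.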

For the Get-Childitem part you might do the same thing (stream! meaning: use the mighty PowerShell pipeline rather than piling everything up in $GetObjectInfo):

$ObjSize = Get-Childitem -Path $Path -Recurse -Force -ErrorAction SilentlyContinue |
    Measure-Object -Property Length -Sum -ErrorAction SilentlyContinue

As an aside, you might simplify your three size properties into a single, smarter size property; see: How to convert value to KB, MB, or GB depending on digit placeholders?
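
One way such a single size property could look is sketched below. Format-ByteSize is a hypothetical helper name, not something taken from the linked question, and the two-decimal format simply mirrors the "{0:N2}" format already used in the script.

```powershell
# Hypothetical helper: pick the largest unit that keeps the value >= 1.
function Format-ByteSize {
    param([Parameter(Mandatory)][double]$Bytes)

    switch ($Bytes) {
        { $_ -ge 1TB } { return '{0:N2} TB' -f ($Bytes / 1TB) }
        { $_ -ge 1GB } { return '{0:N2} GB' -f ($Bytes / 1GB) }
        { $_ -ge 1MB } { return '{0:N2} MB' -f ($Bytes / 1MB) }
        { $_ -ge 1KB } { return '{0:N2} KB' -f ($Bytes / 1KB) }
        default        { return '{0:N0} B'  -f $Bytes }
    }
}

# Usage: a single 'Size' column instead of three.
Format-ByteSize -Bytes 1536      # falls in the KB branch
Format-ByteSize -Bytes (5 * 1GB) # falls in the GB branch
```

The formatted string is meant for display only; if the CSV will be sorted or summed later, keeping the raw byte count in a separate column is probably safer.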

CodePudding user response:

You are buffering the entire collection of [FileSystemInfo] on $FileShare into a variable with...

$GetObjectInfo = Get-Childitem -Path $Path -Recurse -Force -ErrorAction SilentlyContinue

So, if there's a million directories and files on that share then that's a million [FileSystemInfo] instances stored in a million-element array, none of which can be garbage collected during that iteration of the foreach loop. You can use Group-Object to improve that a bit...

$groupsByPSIsContainer = Get-Childitem -Path $Path -Recurse -Force -ErrorAction SilentlyContinue |
    Group-Object -Property 'PSIsContainer' -AsHashTable
# $groupsByPSIsContainer is a [Hashtable] with two keys:
#     - $true gets the collection of directories
#     - $false gets the collection of files

$ObjSize = $groupsByPSIsContainer[$false] | Measure-Object -Property Length -Sum -ErrorAction SilentlyContinue

$NumFiles   = $groupsByPSIsContainer[$false].Count
$NumFolders = $groupsByPSIsContainer[$true].Count

...but that still ends up storing all of the [FileSystemInfo]s in the two branches of the [Hashtable]. Instead, I would just enumerate and count the results myself...

$ObjSize    = 0L # Stores the total file size directly; use $ObjSize instead of $ObjSize.Sum
$NumFiles   = 0
$NumFolders = 0

foreach ($fileSystemInfo in Get-Childitem -Path $Path -Recurse -Force -ErrorAction SilentlyContinue)
{
    if ($fileSystemInfo.PSIsContainer)
    {
        $NumFolders++
    }
    else
    {
        $NumFiles++
        $ObjSize += $fileSystemInfo.Length
    }
}

That stores only the current enumeration result in $fileSystemInfo and never the entire sequence.
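
If memory still climbs with the foreach statement (as far as I know, it can collect a command's output before the loop body starts), the same counting can be done with the ForEach-Object cmdlet, which processes items as they arrive. This is a sketch under that assumption; Measure-Tree is a hypothetical function name used only to keep the example self-contained.

```powershell
# Hypothetical wrapper: one streaming pass that counts files, folders and bytes.
function Measure-Tree {
    param([Parameter(Mandatory)][string]$Path)

    $size    = 0L   # running total of file sizes in bytes
    $files   = 0
    $folders = 0

    Get-ChildItem -LiteralPath $Path -Recurse -Force -ErrorAction SilentlyContinue |
        ForEach-Object {
            # The script block runs in the function's scope, so the counters update in place.
            if ($_.PSIsContainer) { $folders++ }
            else                  { $files++; $size += $_.Length }
        }

    [PSCustomObject]@{
        SizeBytes  = $size
        NumFiles   = $files
        NumFolders = $folders
    }
}

# Usage inside the per-share loop, with $Path being the share path:
#   $stats = Measure-Tree -Path $Path
```

Only the three counters survive each iteration, so memory use should stay roughly flat regardless of how many items the share contains.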

Note that if you weren't summing the files' sizes Group-Object would work well...

$groupsByIsContainer = Get-Childitem -Path $Path -Recurse -Force -ErrorAction SilentlyContinue |
    Group-Object -Property 'PSIsContainer' -NoElement

$NumFiles   = ($groupsByIsContainer | Where-Object -Property 'Name' -EQ -Value $false).Count
$NumFolders = ($groupsByIsContainer | Where-Object -Property 'Name' -EQ -Value $true ).Count

-NoElement prevents the resulting group objects from storing the grouped elements; we just care about the count of members in each grouping but not the members themselves. If we passed -AsHashTable then we'd lose the convenient Count property, hence why the two groups have to be accessed in this awkward way.
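
A slightly less awkward way to read the two counts is to copy them into a small lookup keyed by the group name, as sketched below. Get-ItemCount is a hypothetical name; the sketch relies on the group Name being the string form of the Boolean ('True' or 'False'), and it treats a missing group as a count of zero.

```powershell
# Hypothetical wrapper: count files and folders without keeping the items.
function Get-ItemCount {
    param([Parameter(Mandatory)][string]$Path)

    $counts = @{}
    Get-ChildItem -LiteralPath $Path -Recurse -Force -ErrorAction SilentlyContinue |
        Group-Object -Property 'PSIsContainer' -NoElement |
        ForEach-Object { $counts[$_.Name] = $_.Count }

    [PSCustomObject]@{
        # A missing key yields $null, which [int] converts to 0.
        NumFiles   = [int]$counts['False']
        NumFolders = [int]$counts['True']
    }
}
```

This also covers a share that contains only files or only folders, where one of the two groups would not exist at all.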
