Powershell foreach read write slow

Time:10-28

I have a few hundred files that are around 1.5 MB each. I need to run each file through the loop below, but it is very slow: each file takes about 5 minutes to process. Is there a faster way?

function Convert-File($inputFile,$outputFile,$dataDate)
{
    if ([string]::IsNullOrEmpty($dataDate))
    {
        $dataDate = $inputFile.split('.') | select -last 1
    }
    Write-Host "File data date is $dataDate"
    #Get-Content $inputFile | Select-String -pattern $dataDate | Out-File $outputFile
    $header=""
    $headerOut=$false
    if (Test-Path $outputFile)
    {
        Remove-Item $outputFile
    }
    foreach($line in [System.IO.File]::ReadLines($inputFile))
    {
        if ($line.StartsWith("!"))
        {
            $header=$line
            continue
        }
        if ($line.Contains($dataDate))
        {
            if (!$headerOut)
            {
                $headerOut=$true
                #Write-Host $header
                Set-Content -Path $outputFile -Value $header.substring(1).Replace('|',',') -Force
            }
            if ([string]::IsNullOrEmpty($line)) { continue }
            #Write-Host $line
            Add-Content $outputFile $line.Replace('|',',') -force
        }
    }
}

The code works but I would like the code to perform faster. Any suggestions?

CodePudding user response:

Add-Content is the bottleneck in your code. It opens and closes a FileStream on each loop iteration, which is very expensive. The file should be opened only once.
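To see the cost of reopening the file on every iteration, a rough timing comparison can help. This is only a sketch: the temp file names, the 2000 sample lines, and the sample row text are made up for illustration, and [System.IO.StreamWriter]::new( ) assumes PowerShell 5 or later. Actual timings will vary by machine and disk.

```powershell
# Hypothetical benchmark data and paths (not from the question).
$lines    = 1..2000 | ForEach-Object { "row|$_|value" }
$tempDir  = [System.IO.Path]::GetTempPath()
$slowPath = Join-Path $tempDir 'add-content-demo.txt'
$fastPath = Join-Path $tempDir 'streamwriter-demo.txt'
Remove-Item $slowPath, $fastPath -ErrorAction SilentlyContinue

# Per-line Add-Content: the file is opened and closed on every iteration.
$slow = Measure-Command {
    foreach ($line in $lines) {
        Add-Content -Path $slowPath -Value $line.Replace('|', ',')
    }
}

# Single StreamWriter: the file is opened once and closed once.
$fast = Measure-Command {
    $writer = [System.IO.StreamWriter]::new($fastPath)
    try {
        foreach ($line in $lines) {
            $writer.WriteLine($line.Replace('|', ','))
        }
    }
    finally {
        # Dispose flushes buffered text and releases the file handle.
        $writer.Dispose()
    }
}

'Add-Content : {0:N0} ms' -f $slow.TotalMilliseconds
'StreamWriter: {0:N0} ms' -f $fast.TotalMilliseconds
```

Both files should end up with the same 2000 lines; only the elapsed time differs.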

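Also worth noting: the [string]::IsNullOrEmpty( ) check should be the first condition in your loop. As written, it sits inside the $line.Contains($dataDate) branch, where it can never be true for a non-empty $dataDate. Most likely you want [string]::IsNullOrWhiteSpace( ) instead, though I'll leave that up to you to decide.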
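As a quick illustration of how the two checks differ, the sketch below runs both against a few made-up sample strings. Only the whitespace-only inputs are treated differently.

```powershell
# Sample inputs are hypothetical: empty, spaces only, a tab, and a data-like line.
$samples = '', '   ', "`t", 'data|1'

$results = foreach ($s in $samples) {
    [pscustomobject]@{
        Value              = "[$s]"
        IsNullOrEmpty      = [string]::IsNullOrEmpty($s)
        IsNullOrWhiteSpace = [string]::IsNullOrWhiteSpace($s)
    }
}

# IsNullOrEmpty is True only for ''; IsNullOrWhiteSpace is also True for '   ' and the tab.
$results | Format-Table -AutoSize
```

If the input files can contain lines made up only of spaces or tabs, IsNullOrWhiteSpace skips them as well; IsNullOrEmpty does not.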

This is how your final loop should look using a StreamWriter:

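# $header and $headerOut are assumed to be initialized as in the original function.
try {
    foreach($line in [System.IO.File]::ReadLines($inputFile)) {
        # Skip empty lines before doing any other work.
        if ([string]::IsNullOrEmpty($line)) {
            continue
        }
        # Remember the most recent header line; it is written only once a data line matches.
        if ($line.StartsWith('!')) {
            $header = $line
            continue
        }
        if ($line.Contains($dataDate)) {
            if (-not $headerOut) {
                $headerOut = $true

                # Create (or overwrite) the output file once and keep it open for the whole loop.
                $fs     = (New-Item $outputFile -Force).OpenWrite()
                $writer = [System.IO.StreamWriter] $fs
                $writer.WriteLine($header.SubString(1).Replace('|', ','))
            }

            $writer.WriteLine($line.Replace('|', ','))
        }
    }
}
finally {
    # Dispose the writer first so it flushes before the stream closes.
    # Both are $null if no line matched, so guard each call.
    if ($writer) { $writer.Dispose() }
    if ($fs)     { $fs.Dispose() }
}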
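For reference, here is one way the loop above might be folded back into a complete function. This is a sketch under assumptions, not the original author's code: Convert-FileFast is a hypothetical name, the sample file and its contents are made up, [System.IO.StreamWriter]::new( ) assumes PowerShell 5 or later, and the guard on an empty header is an addition. Full paths are assumed, because .NET resolves relative paths against the process working directory rather than the current PowerShell location.

```powershell
# Hypothetical wrapper around the StreamWriter loop; the name is not from the thread.
function Convert-FileFast {
    param(
        [Parameter(Mandatory)] [string] $InputFile,
        [Parameter(Mandatory)] [string] $OutputFile,
        [string] $DataDate
    )

    # Same fallback as the question: last dot-separated part of the input path.
    if ([string]::IsNullOrEmpty($DataDate)) {
        $DataDate = $InputFile.Split('.') | Select-Object -Last 1
    }

    # As in the question, start from a clean slate.
    if (Test-Path $OutputFile) { Remove-Item $OutputFile }

    $header = ''
    $writer = $null
    try {
        foreach ($line in [System.IO.File]::ReadLines($InputFile)) {
            if ([string]::IsNullOrEmpty($line)) { continue }
            if ($line.StartsWith('!')) { $header = $line; continue }
            if (-not $line.Contains($DataDate)) { continue }

            if (-not $writer) {
                # First matching line: open the output once and emit the header without '!'.
                $writer = [System.IO.StreamWriter]::new($OutputFile, $false)
                # Added guard (assumption): skip the header if none was seen yet.
                if ($header) { $writer.WriteLine($header.Substring(1).Replace('|', ',')) }
            }
            $writer.WriteLine($line.Replace('|', ','))
        }
    }
    finally {
        if ($writer) { $writer.Dispose() }
    }
}

# Made-up sample data to exercise the function.
$in  = Join-Path ([System.IO.Path]::GetTempPath()) 'sample.20230101'
$out = Join-Path ([System.IO.Path]::GetTempPath()) 'sample.csv'
Set-Content -Path $in -Value '!date|name|value', '20230101|a|1', '20221231|b|2', '', '20230101|c|3'

Convert-FileFast -InputFile $in -OutputFile $out
$result = Get-Content $out
$result
```

With the sample data above, the output file should contain the header row followed by the two rows that contain 20230101.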