I have a few hundred files of around 1.5 MB each that I need to run through the loop below, but it is very slow: each file takes about 5 minutes to process. Is there a faster way?
function Convert-File($inputFile, $outputFile, $dataDate)
{
    if ([string]::IsNullOrEmpty($dataDate))
    {
        $dataDate = $inputFile.split('.') | select -last 1
    }
    Write-Host "File data date is $dataDate"
    #Get-Content $inputFile | Select-String -pattern $dataDate | Out-File $outputFile
    $header = ""
    $headerOut = $false
    if (Test-Path $outputFile)
    {
        Remove-Item $outputFile
    }
    foreach ($line in [System.IO.File]::ReadLines($inputFile))
    {
        if ($line.StartsWith("!"))
        {
            $header = $line
            continue
        }
        if ($line.Contains($dataDate))
        {
            if (!$headerOut)
            {
                $headerOut = $true
                #Write-Host $header
                Set-Content -Path $outputFile -Value $header.Substring(1).Replace('|', ',') -Force
            }
            if ([string]::IsNullOrEmpty($line)) { continue }
            #Write-Host $line
            Add-Content $outputFile $line.Replace('|', ',') -Force
        }
    }
}
The code works, but I would like it to perform faster. Any suggestions?
CodePudding user response:
Add-Content is the bottleneck in your code: opening and closing a FileStream on each loop iteration is very expensive, and it's an operation that should be done only once. Also worth noting, the [string]::IsNullOrEmpty() check should be the first condition in your loop, and most likely you want to use [string]::IsNullOrWhiteSpace() instead, though I'll leave that up to you to decide.
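To get a feel for the difference, here is a rough micro-benchmark sketch (the file names and line count are arbitrary placeholders) contrasting the per-call open/close cost of Add-Content with a single StreamWriter held open for the whole run:

$lines = 1..10000 | ForEach-Object { "row $_" }

# Add-Content: the output file is opened and closed once per line
Measure-Command {
    foreach ($l in $lines) { Add-Content -Path "$PWD\slow.txt" -Value $l }
}

# StreamWriter: one handle held open for the entire loop
Measure-Command {
    $writer = [System.IO.StreamWriter]::new("$PWD\fast.txt")
    try {
        foreach ($l in $lines) { $writer.WriteLine($l) }
    }
    finally {
        $writer.Dispose()
    }
}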
This is how your final loop should look using a StreamWriter:
$header = ''
$headerOut = $false

try {
    foreach ($line in [System.IO.File]::ReadLines($inputFile)) {
        # skip blank lines before doing any other work
        if ([string]::IsNullOrEmpty($line)) {
            continue
        }
        # remember the most recent header line ("!"-prefixed)
        if ($line.StartsWith('!')) {
            $header = $line
            continue
        }
        if ($line.Contains($dataDate)) {
            # open the output file once, on the first matching line
            if (-not $headerOut) {
                $headerOut = $true
                $fs = (New-Item $outputFile -Force).OpenWrite()
                $writer = [System.IO.StreamWriter] $fs
                $writer.WriteLine($header.Substring(1).Replace('|', ','))
            }
            $writer.WriteLine($line.Replace('|', ','))
        }
    }
}
finally {
    # dispose only if a match was found and the stream was actually opened;
    # disposing the writer also closes the underlying FileStream, so the
    # second call is a harmless no-op once the writer exists
    if ($writer) { $writer.Dispose() }
    if ($fs) { $fs.Dispose() }
}
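For completeness, a sketch of a driver loop over the whole set of files (the folder path is an assumption; omitting $dataDate makes the function derive it from the file name, as in your original code):

# Hypothetical driver: adjust C:\data and the output naming to your setup
Get-ChildItem -Path C:\data -File | ForEach-Object {
    Convert-File $_.FullName "$($_.FullName).csv"
}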