PowerShell using Foreach-ObjectFast and Where-ObjectFast

Time:10-22

I have never really worked with PowerShell, so I am quite stuck with this. My goal is to merge multiple CSVs into one, more specifically three at the moment.

Using Import-Csv and Foreach-Object I managed to achieve this, but it is extremely slow. I discovered this article, so I gave it a try. The iteration is incredibly fast.

Unfortunately I am too dumb with PowerShell to understand why I cannot use Where-ObjectFast properly; it won't match anything.

My code example:

Measure-Command { $CSV1 = Import-CSV -Path .\CSV1.csv }

Measure-Command { $CSV2 = Import-CSV -Path .\CSV2.csv }

Measure-Command {
    Import-Csv -Path .\CSV3.csv | Foreach-ObjectFast {
        $row = $_
        $match = $CSV1 | Where-ObjectFast -FilterScript { $_.name -eq $row.'name' }
        $dbg1 = 'Matched: {0}' -f $match
        Write-Host $dbg1 -ForegroundColor Cyan

        # continued...
    }
}

What I basically need to do is match "name" from CSV3 with "name" from CSV1 (and other fields as needed), then do the same for CSV2 and output to a final file.

It seems that when using Where-ObjectFast $_ is empty (?).

Please advise what I am doing wrong here, I would really appreciate it.

CodePudding user response:

The problem you're having is not the overhead of ForEach-Object binding and processing its input, so replacing ForEach-Object with ForEach-ObjectFast is not going to have a significant impact. The real cost is that Where-Object scans all of $CSV1 again for every single row in CSV3.

If you want to pivot on the name column (or any other column), build index tables with a hashtable/dictionary:

$CSV1 = @{}
Import-Csv -Path .\CSV1.csv |ForEach-Object { $CSV1[$_.name] = $_ }

$CSV2 = @{}
Import-Csv -Path .\CSV2.csv |ForEach-Object { $CSV2[$_.id] = $_ }
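
One caveat, which is an assumption about the data rather than something the question states: if a key occurs more than once in a file, the assignment above keeps only the last row for that key. A minimal sketch of an index that keeps every row per key could look like this. The sample data, the `size` column, and the variable names `$rows` and `$index` are hypothetical.

```powershell
# Hypothetical sample rows standing in for Import-Csv -Path .\CSV1.csv
$rows = @'
name,size
alpha,10
alpha,11
beta,20
'@ | ConvertFrom-Csv

# Index name -> list of rows, so duplicate keys are not overwritten
$index = @{}
foreach ($r in $rows) {
  if (-not $index.ContainsKey($r.name)) {
    $index[$r.name] = [System.Collections.Generic.List[object]]::new()
  }
  $index[$r.name].Add($r)
}

# Each lookup now returns all rows that share the key
$alphaRows = $index['alpha']
```

If the keys are known to be unique, the simpler one-row-per-key table above is enough.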

Now you don't need to wait for Where-Object to search through each collection:

Import-Csv -Path .\CSV3.csv |ForEach-Object {
  $row = $_
  # This is going to be MUCH faster than ... |Where-Object { ... }
  $csv1match = $CSV1[$row.name]
  $csv2match = $CSV2[$row.id]

  # join $row,$csv1match,$csv2match here
}
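
The join step left as a comment above could be filled in roughly as follows. This is only a sketch under assumptions: the columns `size` and `owner`, the inline sample data, and the output file name `merged.csv` are hypothetical and stand in for whatever fields your files actually contain.

```powershell
# Hypothetical sample data standing in for the three CSV files
$csv1Rows = @'
name,size
alpha,10
beta,20
'@ | ConvertFrom-Csv

$csv2Rows = @'
id,owner
1,alice
2,bob
'@ | ConvertFrom-Csv

$csv3Rows = @'
name,id
alpha,1
beta,2
gamma,3
'@ | ConvertFrom-Csv

# Build the index tables: one keyed on name, one keyed on id
$CSV1 = @{}
foreach ($r in $csv1Rows) { $CSV1[$r.name] = $r }
$CSV2 = @{}
foreach ($r in $csv2Rows) { $CSV2[$r.id] = $r }

# Emit one merged object per CSV3 row
$merged = foreach ($row in $csv3Rows) {
  $csv1match = $CSV1[$row.name]
  $csv2match = $CSV2[$row.id]

  [pscustomobject]@{
    name  = $row.name
    id    = $row.id
    size  = $csv1match.size    # $null when CSV1 has no matching name (strict mode off)
    owner = $csv2match.owner   # $null when CSV2 has no matching id (strict mode off)
  }
}

# Write the combined rows to the final file
$merged | Export-Csv -Path .\merged.csv -NoTypeInformation
```

Rows in CSV3 without a match still appear in the output with empty fields; filter on `$csv1match` or `$csv2match` first if you would rather drop them.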