Please reference earlier the question : How to get replace by id (Id value get from XML) and put into CSV, powershell
I need add ForEach-Object
instead of ForEach
, I have 11,000 rows. So, it takes too much time.
foreach ($csvRow in $csvRows) {
foreach ($columnName in $columnNames) {
$id = $csvRow.$columnName
if (-not $id) { continue }
$newId = @($xmlDoc.enum_types.enum_type.Where( { $_.field_name -eq $columnName }, 'First').
items).ForEach( { $_.item }).Where( { $_.id -eq $id }).value
$csvRow.$columnName = $newId
}
}
$csvRows | Export-Csv -NoTypeInformation -Encoding utf8 $CSVpath
CodePudding user response:
One thing you could do is pre-compute your lookups from the "enums" xml in your other question.
Original code - 7649 milliseconds
# setup
$xml = @"
<enum_types>
<enum_type field_name="Test1">
<items>
<item>
<id>1</id>
<value>A</value>
</item>
</items>
</enum_type>
<enum_type field_name="Test2">
<items>
<item>
<id>1</id>
<value>A</value>
</item>
</items>
</enum_type>
</enum_types>
"@;
$enums = [xml] $xml;
$csv = @("Test1, Test2");
$csv = 1..110000 | % { "1, 1" }
$data = $csv | ConvertFrom-Csv
$columnNames = @( "Test1", "Test2" );
# perf test - 7649 milliseconds
measure-command -expression {
foreach ($csvRow in $data)
{
foreach ($columnName in $columnNames)
{
$id = $csvRow.$columnName
if ( -not $id )
{
continue;
}
$newId = @($enums.enum_types.enum_type.Where( { $_.field_name -eq $columnName }, 'First').
items).ForEach( { $_.item }).Where( { $_.id -eq $id }).value
$csvRow.$columnName = $newId
}
}
}
With pre-computed lookups - 792 milliseconds
# setup
... as above ...
# pre-compute lookups
$lookups = @{};
foreach( $enum_type in $enums.enum_types.enum_type )
{
$lookups[$enum_type.field_name] = @{}
foreach( $item in $enum_type.items.item )
{
$lookups[$enum_type.field_name][$item.id] = $item.value
}
}
write-host ($lookups | convertto-json -depth 99)
# {
# "Test1": {
# "1": "A"
# },
# "Test2": {
# "1": "A"
# }
# }
# perf test - 792 milliseconds
measure-command -expression {
foreach ($csvRow in $data)
{
foreach ($columnName in $columnNames)
{
$id = $csvRow.$columnName
if (-not $id)
{
continue;
}
$csvRow.$columnName = $lookups[$columnName][$id]
}
}
}
There's possibly more you could squeeze out of this approach, but it's already a 10x speedup (in my very limited benchmarks).
CodePudding user response:
Preload all the new items into hash tables:
$Types = $xmlDoc.enum_types.enum_type
$HashTable = @{}
Import-Csv .\1.Csv |ForEach-Object { $Names = $Null } {
if (!$Names) {
$Names = $_.psobject.Properties.Name
foreach ($Name in $Names) {
$HashTable[$Name] = @{}
foreach ($Field in $Types.Where{$_.field_name -eq $Name}) {
foreach ($Item in $Field.Items.Item) {
$HashTable[$Name][$Item.Id] = $Item.Value
}
}
}
}
foreach ($Name in $Names) {
$id = $_.$Name
If ($HashTable[$Name].Contains($Id)) {
$_.$Name = $HashTable[$Name][$Id]
}
}
$_
} |Export-Csv .\1New.Csv -NoTypeInformation -Encoding utf8