I want to work with a CSV file of more than 300,000 lines. I need to check the information line by line and then write it to a .txt file as a table, so I can see which files are missing on which servers. For example:
Name,Server
File1,Server1
File2,Server1
File3,Server1
File1,Server2
File2,Server2
...
File345,Server76
File346,Server32
I want to display the result as a table like this, corresponding to the example above:
Name     Server1  Server2  ...  Server32  ...  Server76
File1    X        X
File2    X        X
File3    X
...
File345                                        X
File346                         X
To do this, I currently have a function that creates objects whose members are the server names (the number of members can change), and I use a StreamReader to split the data (my CSV has more than two columns, so index 0 is the server name and index 5 is the file name):
$stream = [System.IO.StreamReader]::new($File)
$stream.ReadLine() | Out-Null   # skip the header line
while (-not $stream.EndOfStream) {
    $line = $stream.ReadLine()
    # Column 0 is the server name, column 5 is the file name
    $strTempo = $line -split ","
    $index = $listOfFile.Name.IndexOf($strTempo[5])
    if ($index -ne -1) {
        # File already known: mark it as present on this server
        $property = $strTempo[0].Replace("-", "_")
        $listOfFile[$index].$property = "X"
    }
    else {
        # File not seen yet: create a new row object for it
        $obj = CreateEmptyObject ($listOfConfiguration)
        $obj.Name = $strTempo[5]
        $listOfFile.Add($obj) | Out-Null
    }
}
When I export this, I get a pretty good result, but the script takes a long time (between 20 minutes and 1 hour).
I don't know how to optimize the script; I'm a beginner in PowerShell.
Thanks in advance for any tips.
CodePudding user response:
You might use HashSets for this:
$Servers = [System.Collections.Generic.HashSet[String]]::New()   # all server names seen
$Files = @{}                                                     # file name -> set of servers it exists on
Import-Csv -Path $Path | ForEach-Object {
    $Null = $Servers.Add($_.Server)
    if ($Files.Contains($_.Name)) { $Null = $Files[$_.Name].Add($_.Server) }
    else { $Files[$_.Name] = [System.Collections.Generic.HashSet[String]]$_.Server }
}
# Build one row per file, with an 'X' in each server column where the file exists
$Table = foreach ($Name in $Files.get_Keys()) {
    $Properties = [Ordered]@{ Name = $Name }
    foreach ($Server in $Servers) {
        $Properties[$Server] = if ($Files[$Name].Contains($Server)) { 'X' }
    }
    [PSCustomObject]$Properties
}
$Table | Format-Table -Property @{ expression = '*' }
Note that, in contrast to PowerShell's usual behavior, the .NET HashSet class is case-sensitive by default. To create a case-insensitive HashSet, use the following constructor:
[System.Collections.Generic.HashSet[String]]::New([StringComparer]::OrdinalIgnoreCase)
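For example, a minimal sketch of how that comparer could be plugged into the snippet above (same variable names as above; treating server names that differ only in case as the same server is an assumption):
$Comparer = [StringComparer]::OrdinalIgnoreCase
$Servers  = [System.Collections.Generic.HashSet[String]]::New($Comparer)   # case-insensitive set of server names
$Files = @{}
Import-Csv -Path $Path | ForEach-Object {
    $Null = $Servers.Add($_.Server)
    if (-not $Files.Contains($_.Name)) {
        # each file gets its own case-insensitive set of servers
        $Files[$_.Name] = [System.Collections.Generic.HashSet[String]]::New($Comparer)
    }
    $Null = $Files[$_.Name].Add($_.Server)
}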
CodePudding user response:
See if this works faster. Change the filename as required.
$Path = "C:\temp\test1.txt"
$table = Import-Csv -Path $Path
$columnNames = $table | Select-Object -Property Server -Unique| foreach{$_.Server} | Sort-Object
Write-Host "names = " $columnNames
$groups = $table | Group-Object {$_.Name}
$outputTable = [System.Collections.ArrayList]@()
foreach($group in $groups)
{
Write-Host "Group = " $group.Name
$newRow = New-Object -TypeName psobject
$newRow | Add-Member -NotePropertyName Name -NotePropertyValue $group.Name
$servers = $group.Group | Select-Object -Property Server | foreach{$_.Server}
Write-Host "servers = " $servers
foreach($item in $columnNames)
{
if($servers.Contains($item))
{
$newRow | Add-Member -NotePropertyName $item -NotePropertyValue 'X'
}
else
{
#if you comment out next line code doesn't work
$newRow | Add-Member -NotePropertyName $item -NotePropertyValue ''
}
}
$outputTable.Add($newRow) | Out-Null
}
$outputTable | Format-Table
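Since the goal is a .txt file rather than console output, either result can be redirected with Out-File; a minimal sketch (the output path is just an example):
$outputTable | Format-Table -AutoSize | Out-File -FilePath "C:\temp\result.txt" -Width 4096   # wide enough for many server columns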