Home > OS >  Powershell - Group and count unique values from CSV file based on a column
Powershell - Group and count unique values from CSV file based on a column

Time:04-06

I am trying to get total completed and failed job for each days. If a job failed for any specific VM (Name field) for any specific day, it will retry the operation.If it complete in second attempt, I want o ignore the failed count for that and reduce the total count accordingly

Example:

My code

$csv =ConvertFrom-Csv @"
Name,Start_time,Status
vm1,20-03-2022,Completed
vm2,20-03-2022,Completed

vm1,21-03-2022,Failed
vm1,21-03-2022,Completed
vm2,21-03-2022,Completed

vm1,22-03-2022,Completed
vm2,22-03-2022,Failed
vm2,22-03-2022,Failed
"@

$Results = @()
foreach ($group in $csv | Group Start_time)
{
    $Completed = ($group.Group | group status | ? Name -eq Completed).Count
    $Failed = ($group.Group | group status | ? Name -eq Failed).Count
    $row = "" | Select Date,Total,Completed,Failed,"Success %"
    $row.Date = $group.Name
    $row.Total = $group.Count
    $row.Completed = $Completed
    $row.Failed = $Failed
    $row."Success %" =  [math]::Round($Completed / $row.Total * 100,2)
    $Results  = $row
}

Above code will give me output as :

Date            Total       Completed       Failed          Success %
20-03-2022      2               2             0               100
21-03-2022      3               2             1               66.67
22-03-2022      3               1             2               33.33

But I am looking for the unique value for each VM for each day and ignore the failure, any retry shows as completed

Date            Total       Completed       Failed          Success %
20-03-2022      2               2             0               100      -> Job completed for vm1 and vm2
21-03-2022      2               2             0               100      -> job failed for vm1 first, but in second try it completed. same day 2 entries for vm1(Failed and Completed. Ignore failure and take only completed)
22-03-2022      2               1             1               50       -> vm2 failed on both attempt. so it has to take as 1 entry. ignore the duplicate run.

CodePudding user response:

This seems to work, needless to say, you're displaying the information in an quite unorthodox way. I believe the code you currently have is how the information should be displayed.

Using this CSV for demonstration:

$csv = ConvertFrom-Csv @"
Name,Start_time,Status
vm1,20-03-2022,Completed
vm2,20-03-2022,Completed
vm1,21-03-2022,Failed
vm1,21-03-2022,Completed
vm2,21-03-2022,Completed
vm1,22-03-2022,Completed
vm2,22-03-2022,Failed
vm2,22-03-2022,Failed
vm1,23-03-2022,Failed
vm1,23-03-2022,Failed
vm2,23-03-2022,Failed
vm2,23-03-2022,Failed
"@

Code:

$csv | Group-Object Start_Time | ForEach-Object {
    $completed = 0; $failed = 0
    $thisGroup = $_.Group | Group-Object Name
    foreach($group in $thisGroup) {
        if('Completed' -in $group.Group.Status) {
            $completed  
            continue
        }
        $failed  
    }
    $total = $completed   $failed

    [pscustomobject]@{
        Start_Date = $_.Name
        Total      = $total
        Completed  = $completed
        Failed     = $failed
        Success    = ($completed / $total).ToString('P0')
    }
} | Format-Table

Result:

Start_Date Total Completed Failed Success
---------- ----- --------- ------ -------
20-03-2022     2         2      0 100%
21-03-2022     2         2      0 100%
22-03-2022     2         1      1 50%
23-03-2022     2         0      2 0%
  • Related