Home > Net >  How do I optimise performance when selecting first 2 rows per group from Excel spreadsheet in PowerS
How do I optimise performance when selecting first 2 rows per group from Excel spreadsheet in PowerS

Time:03-17

I have a requirement to select the first X rows for each unique agent ID. My method below works, but it runs into performance issues when the spreadsheet has over about 5k results to consider. I am hopeful that you very smart people can help me optimise the approach to require less processing.

I am using ImportExcel to import the spreadsheet of call records, then I filter out uninteresting rows and I'm left with $UsableCalls as my pool of calls to be evaluated. Sometimes, this pool has only 2k rows. Sometimes it has 16k. It's possible that it might have even more. Unfortunately, it seems like the max this method can support is around 5k-ish results. Anything over that and the process hangs. So if I have 5k rows, then I can really only handle getting the first 1 call per agent. If I have 2k, then I can get the first 2 calls per agent. The number of calls per agent is selectable, and I'd like to have the option to get up to the first 5 calls per agent, but that simply won't work with the way it processes right now.

Ultimately, the goal is to select the first X# calls (rows) for each agent. I then export that as a second spreadsheet. This is the only method I could come up with, but I am certainly open to suggestion.

Here is what I have, how can I improve it?

# this custom function allows the user to select a digit, uses wpf messagebox
$numberagent = select-numberperagent  

# collect the unique agent IDs
$UniqueAgentIDs = @()
$UniqueAgentIDs  = ($UsableCalls | Select-Object -Property "Agent ID" -Unique )."Agent ID"

# select first X# of each agent's calls
$CallsPerAgent = @()
foreach ($UniqueAgent in $UniqueAgentIDs) {
    $CallsPerAgent  = ($UsableCalls | ? {$_."Agent ID" -eq "$UniqueAgent"}) | select -First $numberagent
    } #close foreach uniqueagent

And here is an example of one of the custom PS Objects in the variable $usableCalls:

PS C:\> $usableCalls[0]

DateTime    : 2022-03-03 11:06:16.063
DigitDialed : 781
Agent ID    : 261
Agent Name  : CCM
Skill group : PAYE.
CallType    : PAYE
PPSN        : 81
DNIS        : 10
ANI         : 772606677789
Disposition : Handled
Duration    : 818
RingTime    : 12
DelayTime   : 0
HoldTime    : 0
TalkTime    : 14
WorkTime    : 31

CodePudding user response:

The first thing to improve the speed is to not use = to add stuff to an array.
By doing so, on each addition, the entire array needs to be rebuilt in memory. Better let PowerShell do the collecting of data for you:

# collect the unique agent IDs
$UniqueAgentIDs = ($UsableCalls | Select-Object -Property 'Agent ID' -Unique).'Agent ID'

# select first X# of each agent's calls
$CallsPerAgent = foreach ($UniqueAgent in $UniqueAgentIDs) {
    $UsableCalls | Where-Object {$_.'Agent ID' -eq $UniqueAgent} | Select-Object -First $numberagent
}

Without really knowing what objects are in your variable $UsableCalls, you might even be better off using Group-Object to group all calls in the Agent's ID and loop over these groups

$CallsPerAgent = $UsableCalls | Group-Object -Property 'Agent ID' | ForEach-Object {
    $_.Group | Select-Object -First $numberagent
}
  • Related