PowerShell Group-Object no longer splitting objects into fixed-size collections

Time:04-27

I had this in a script which I believe was working well, but appears to have stopped working:

$testList = @("object 1","object 2","object 3","object 4","object 5")
$counter = 0
$maxSize = 2
$groupedList = $testList | Group-Object { [math]::Floor($counter++ / $maxSize) }
$groupedList
$groupedList | Measure-Object

Previously Measure-Object would have given me a Count of 3, but now I receive a count of 1.

Why is this no longer working? Is the counter integer not being incremented within the Group-Object command anymore?

CodePudding user response:

I don't think your command ever worked as intended (see the bottom section for details).

Replace:

{ [math]::Floor($counter++ / $maxSize) }

with:

{ [math]::Floor((Get-Variable -Scope 1 counter).Value++ / $maxSize) }

in order to ensure that it is the caller's $counter variable that is updated across invocations of the script block.
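
As a quick check, here is the question's snippet with only the script block swapped for the Get-Variable variant. This is a minimal sketch that assumes the code runs directly in the scope where $counter is defined (a script's top level or the interactive prompt):

```powershell
$testList = @("object 1","object 2","object 3","object 4","object 5")
$counter = 0
$maxSize = 2

# Get-Variable -Scope 1 returns the caller's variable object, so .Value++
# increments the caller's $counter instead of a block-local copy.
$groupedList = $testList |
    Group-Object { [math]::Floor((Get-Variable -Scope 1 counter).Value++ / $maxSize) }

($groupedList | Measure-Object).Count   # 3
$counter                                # 5: incremented once per input object
```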

Note:

  • Using $script:counter++ in the script block works too, but only if the calling scope is a script's top-level scope.

  • Get-Variable -Scope 1 counter explicitly targets the parent scope's definition of $counter, which is the caller's; for brevity you could omit -Scope 1, because by default a variable in the closest ancestral scope is returned.

  • Santiago Squarzon suggests another option:

    • You can define your $counter variable as a reference-type object with a .Value property rather than as an integer, in the simplest form as a hashtable:

      $counter = @{ Value = 0 }
      
    • By virtue of $counter now containing an object reference, the child scope in the script block sees the very same object when querying the value of the parent scope's $counter variable, and can update its .Value property:

      { [math]::Floor($counter.Value++ / $maxSize) }
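
Put together, the hashtable variant might look as follows. This is a sketch based on the question's sample data; the output formatting at the end is only for illustration:

```powershell
$testList = @("object 1","object 2","object 3","object 4","object 5")
$counter = @{ Value = 0 }   # reference-type wrapper around the counter
$maxSize = 2

# The child scope only reads $counter (an object reference) and updates the
# object's .Value entry, so no block-local copy of the variable is created.
$groupedList = $testList | Group-Object { [math]::Floor($counter.Value++ / $maxSize) }

$groupedList | ForEach-Object { '{0}: {1}' -f $_.Name, ($_.Group -join ', ') }
# 0: object 1, object 2
# 1: object 3, object 4
# 2: object 5
```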
      

The problem is that the script block passed to the (positionally) implied -Property parameter of Group-Object runs in a child scope, so that $counter++ implicitly creates a block-local variable on every invocation:

  • That is, $counter++ does not modify the existing variable in the parent scope; it implicitly creates a scope-local copy of the $counter variable (initialized with the caller's current value) and increments that copy.

    • This perhaps surprising behavior - a local variable implicitly getting created when (ostensibly) assigning to a variable that exists in an ancestral scope - is discussed in detail in this answer.

    • On a related note, it may be surprising that such a script block (which includes calculated properties in general) runs in a child scope rather than directly in the caller's to begin with - see GitHub issue #7157

  • Since your ++ is a post-increment, it is the original 0 value (copied from the parent-scope variable) that acts as the LHS operand of the division operation /.

  • On exiting the script block, the local $counter variable goes out of scope, and the next invocation starts from scratch.

Thus, in effect, the script block's return value is a fixed 0, so you'll only ever get one group.
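
The effect can be reproduced in isolation. The following sketch uses 1..5 as stand-in input and shows that every invocation yields the key 0 while the caller's variable is never touched:

```powershell
$counter = 0
$maxSize = 2

# Each invocation of the script block increments its own local copy of $counter.
$groups = 1..5 | Group-Object { [math]::Floor($counter++ / $maxSize) }

@($groups).Count   # 1: a single group
$groups.Name       # 0: the only key ever produced
$counter           # 0: the caller's variable is unchanged
```

Note the @(...) around $groups: with only one group, $groups.Count would report the Count property of that group object (the number of grouped elements) rather than the number of groups.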


Alternatives to using Group-Object for "chunking":

Your use of Group-Object to achieve "chunking" (batching, partitioning, i.e. splitting the input into arrays of fixed size) is elegant, but ultimately inefficient, and it isn't a streaming solution, because all input must be collected up front (neither aspect may matter in a given situation).

  • It would be great if Select-Object directly supported such a feature, which is what GitHub issue #8270 proposes, in the form of a -ReadCount parameter:

    # WISHFUL THINKING
    PS> 1..5 | Select-Object -ReadCount 2 | ForEach-Object { "$_" }
    1 2
    3 4
    5
    
  • This answer provides a custom function, Select-Chunk, which implements this functionality:

    # Assumes function Select-Chunk from the linked answer is defined.
    PS> 1..5 | Select-Chunk -ReadCount 2 | ForEach-Object { "$_" }
    1 2
    3 4
    5
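
For illustration only, a streaming chunking function could be sketched along these lines. Split-IntoChunk and its -Size parameter are hypothetical names chosen here; this is not the Select-Chunk function from the linked answer, and it assumes PowerShell 5 or later (for ::new()):

```powershell
# Hypothetical helper, not the Select-Chunk function from the linked answer.
function Split-IntoChunk {
    param(
        [Parameter(Mandatory, ValueFromPipeline)] $InputObject,
        [Parameter(Mandatory)] [ValidateRange(1, [int]::MaxValue)] [int] $Size
    )
    begin { $buffer = [System.Collections.Generic.List[object]]::new() }
    process {
        $buffer.Add($InputObject)
        if ($buffer.Count -eq $Size) {
            # The unary comma sends the array down the pipeline as a single object.
            , $buffer.ToArray()
            $buffer.Clear()
        }
    }
    end {
        # Emit the final, possibly shorter chunk.
        if ($buffer.Count -gt 0) { , $buffer.ToArray() }
    }
}

1..5 | Split-IntoChunk -Size 2 | ForEach-Object { "$_" }
# 1 2
# 3 4
# 5
```

Because each chunk is emitted as soon as it is full, only one chunk's worth of input is buffered at a time, unlike the Group-Object approach.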
    