This self-answer aims to provide an easy and efficient parallelism alternative for those stuck with Windows PowerShell and unable to install modules due to, for example, company policies.
In Windows PowerShell, the built-in alternatives for local parallel invocations are Start-Job and workflow, both known to be very slow and inefficient. One of them (workflow) is not even recommended and is no longer available in newer versions of PowerShell.
The other alternative is to rely on the PowerShell SDK and code our own parallel logic using what the System.Management.Automation.Runspaces namespace has to offer. This is definitely the most efficient approach and is what ForEach-Object -Parallel (in PowerShell Core) as well as Start-ThreadJob (preinstalled in PowerShell Core and available in Windows PowerShell through the PowerShell Gallery) use behind the scenes.
A simple example:
$throttlelimit = 3
$pool = [runspacefactory]::CreateRunspacePool(1, $throttlelimit)
$pool.Open()
$tasks = 0..10 | ForEach-Object {
    $ps = [powershell]::Create().AddScript({
        'hello world from {0}' -f [runspace]::DefaultRunspace.InstanceId
        Start-Sleep 3
    })
    $ps.RunspacePool = $pool
    @{ Instance = $ps; AsyncResult = $ps.BeginInvoke() }
}
$tasks | ForEach-Object {
    $_.Instance.EndInvoke($_.AsyncResult)
}
$tasks.Instance, $pool | ForEach-Object Dispose
This works well, but it gets tedious and often complicated as the code grows more complex, which in turn raises lots of questions.
Is there an easier way to do it?
CodePudding user response:
Since this topic can be confusing and often brings questions to the site, I have decided to create a function that simplifies this tedious task and helps those stuck in Windows PowerShell. The aim is to keep it as simple and friendly as possible; it should also be a function that can be copy-pasted into our $PROFILE to be reused whenever needed, without requiring the installation of a module (as stated in the question).
This function has been greatly inspired by RamblingCookieMonster's Invoke-Parallel and Boe Prox's PoshRSJob and is merely a simplified take on those with a few improvements.
NOTE
Further updates to this function will be published to the official GitHub repo as well as to the PowerShell Gallery. The code in this answer will no longer be maintained.
Contributions are more than welcome; if you wish to contribute, fork the repo and submit a pull request with the changes.
DEFINITION
using namespace System.Collections.Generic
using namespace System.Management.Automation
using namespace System.Management.Automation.Runspaces
using namespace System.Management.Automation.Language
using namespace System.Text
function Invoke-Parallel {
    [CmdletBinding(PositionalBinding = $false)]
    [Alias('parallel', 'parallelpipeline')]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [object] $InputObject,
        [Parameter(Mandatory, Position = 0)]
        [scriptblock] $ScriptBlock,
        [Parameter()]
        [int] $ThrottleLimit = 5,
        [Parameter()]
        [hashtable] $Variables,
        [Parameter()]
        [ArgumentCompleter({
            param(
                [string] $commandName,
                [string] $parameterName,
                [string] $wordToComplete
            )
            (Get-Command -CommandType Filter, Function).Name -like "$wordToComplete*"
        })]
        [string[]] $Functions,
        [Parameter()]
        [ValidateSet('ReuseThread', 'UseNewThread')]
        [PSThreadOptions] $ThreadOptions = [PSThreadOptions]::ReuseThread
    )
    begin {
        try {
            # Build the session state shared by every runspace in the pool.
            $iss = [initialsessionstate]::CreateDefault2()
            foreach($key in $Variables.PSBase.Keys) {
                $iss.Variables.Add([SessionStateVariableEntry]::new($key, $Variables[$key], ''))
            }
            foreach($function in $Functions) {
                $def = (Get-Command $function).Definition
                $iss.Commands.Add([SessionStateFunctionEntry]::new($function, $def))
            }
            # Collect the values of every `$using:` expression found in the script block.
            $usingParams = @{}
            foreach($usingstatement in $ScriptBlock.Ast.FindAll({ $args[0] -is [UsingExpressionAst] }, $true)) {
                $varText = $usingstatement.Extent.Text
                $varPath = $usingstatement.SubExpression.VariablePath.UserPath
                # Credits to mklement0 for catching a bug here. Thank you!
                # https://github.com/mklement0
                $key = [Convert]::ToBase64String([Encoding]::Unicode.GetBytes($varText))
                if(-not $usingParams.ContainsKey($key)) {
                    $usingParams.Add($key, $ExecutionContext.SessionState.PSVariable.Get($varPath).Value)
                }
            }
            $pool = [runspacefactory]::CreateRunspacePool(1, $ThrottleLimit, $iss, $Host)
            $tasks = [List[hashtable]]::new()
            $pool.ThreadOptions = $ThreadOptions
            $pool.Open()
        }
        catch {
            $PSCmdlet.ThrowTerminatingError($_)
        }
    }
    process {
        try {
            # Thanks to Patrick Meinecke for his help here.
            # https://github.com/SeeminglyScience/
            $ps = [powershell]::Create().AddScript({
                $args[0].InvokeWithContext($null, [psvariable]::new('_', $args[1]))
            }).AddArgument($ScriptBlock.Ast.GetScriptBlock()).AddArgument($InputObject)
            # This is how `Start-Job` does its magic. Credits to Jordan Borean for his help here.
            # https://github.com/jborean93
            # Reference in the source code:
            # https://github.com/PowerShell/PowerShell/blob/7dc4587014bfa22919c933607bf564f0ba53db2e/src/System.Management.Automation/engine/ParameterBinderController.cs#L647-L653
            if($usingParams.Count) {
                $null = $ps.AddParameters(@{ '--%' = $usingParams })
            }
            $ps.RunspacePool = $pool
            $tasks.Add(@{
                Instance    = $ps
                AsyncResult = $ps.BeginInvoke()
            })
        }
        catch {
            $PSCmdlet.WriteError($_)
        }
    }
    end {
        try {
            # Wait for each task in submission order and emit its output.
            foreach($task in $tasks) {
                $task['Instance'].EndInvoke($task['AsyncResult'])
                if($task['Instance'].HadErrors) {
                    $task['Instance'].Streams.Error
                }
            }
        }
        catch {
            $PSCmdlet.WriteError($_)
        }
        finally {
            $tasks.Instance, $pool | ForEach-Object Dispose
        }
    }
}
SYNTAX
Invoke-Parallel -InputObject <Object> [-ScriptBlock] <ScriptBlock> [-ThrottleLimit <Int32>]
[-Variables <Hashtable>] [-ThreadOptions <PSThreadOptions>] [-Functions <String[]>] [<CommonParameters>]
REQUIREMENTS
Compatible with Windows PowerShell 5.1 and PowerShell Core 7.
INSTALLATION
If you wish to install this through the Gallery and have it available as a Module:
Install-Module PSParallelPipeline -Scope CurrentUser
EXAMPLES
EXAMPLE 1: Run slow script in parallel batches
$message = 'Hello world from {0}'
0..10 | Invoke-Parallel {
    $using:message -f [runspace]::DefaultRunspace.InstanceId
    Start-Sleep 3
} -ThrottleLimit 3
EXAMPLE 2: Same as previous example but with -Variables parameter
$message = 'Hello world from {0}'
0..10 | Invoke-Parallel {
    $message -f [runspace]::DefaultRunspace.InstanceId
    Start-Sleep 3
} -Variables @{ message = $message } -ThrottleLimit 3
EXAMPLE 3: Adding to a single thread safe instance
$sync = [hashtable]::Synchronized(@{})
Get-Process | Invoke-Parallel {
    $sync = $using:sync
    $sync[$_.Name] = @( $_ )
}
$sync
EXAMPLE 4: Same as previous example but using -Variables to pass the reference instance to the Runspaces
This is the recommended method when passing reference instances to the runspaces; $using: may fail in some situations.
$sync = [hashtable]::Synchronized(@{})
Get-Process | Invoke-Parallel {
    $sync[$_.Name] = @( $_ )
} -Variables @{ sync = $sync }
$sync
EXAMPLE 5: Demonstrates how to pass a locally defined Function to the Runspace scope
function Greet { param($s) "$s hey there!" }
0..10 | Invoke-Parallel {
    Greet $_
} -Functions Greet
PARAMETERS
-InputObject
Specifies the input objects to be processed in the ScriptBlock.
Note: This parameter is intended to be bound from pipeline.
Type: Object
Parameter Sets: (All)
Aliases:
Required: True
Position: Named
Default value: None
Accept pipeline input: True (ByValue)
Accept wildcard characters: False
-ScriptBlock
Specifies the operation that is performed on each input object.
This script block is run for every object in the pipeline.
Type: ScriptBlock
Parameter Sets: (All)
Aliases:
Required: True
Position: 0
Default value: None
Accept pipeline input: False
Accept wildcard characters: False
-ThrottleLimit
Specifies the number of script blocks that are invoked in parallel.
Input objects are blocked until the running script block count falls below the ThrottleLimit.
The default value is 5.
Type: Int32
Parameter Sets: (All)
Aliases:
Required: False
Position: Named
Default value: 5
Accept pipeline input: False
Accept wildcard characters: False
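As a rough illustration of how a throttle limit bounds concurrency, the sketch below uses a runspace pool directly rather than Invoke-Parallel itself. The task count, the one-second sleep and the variable names are assumptions picked for the demonstration. It queues 6 tasks against a pool capped at 3 runspaces and times the whole run.

```powershell
$throttleLimit = 3
$pool = [runspacefactory]::CreateRunspacePool(1, $throttleLimit)
$pool.Open()

$timer = [System.Diagnostics.Stopwatch]::StartNew()

# Queue 6 tasks; the pool runs at most $throttleLimit of them at a time.
$tasks = foreach ($i in 1..6) {
    $ps = [powershell]::Create().AddScript({
        param($n)
        Start-Sleep -Seconds 1
        $n
    }).AddArgument($i)
    $ps.RunspacePool = $pool
    @{ Instance = $ps; AsyncResult = $ps.BeginInvoke() }
}

# EndInvoke blocks until the given task has finished and returns its output.
$results = foreach ($task in $tasks) {
    $task.Instance.EndInvoke($task.AsyncResult)
}
$timer.Stop()

foreach ($task in $tasks) { $task.Instance.Dispose() }
$pool.Dispose()

'Processed {0} items in {1:N1} seconds' -f @($results).Count, $timer.Elapsed.TotalSeconds
```

Because only 3 tasks can run at once, the 6 one-second tasks need at least two rounds, so the elapsed time should not drop below roughly two seconds. Raising the limit to 6 would let all of them run in a single round.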
-Variables
Specifies a hash table of variables to have available in the Script Block (Runspaces). Each hash table key becomes a variable name inside the Script Block, and the key's value becomes that variable's value.
Type: Hashtable
Parameter Sets: (All)
Aliases:
Required: False
Position: Named
Default value: None
Accept pipeline input: False
Accept wildcard characters: False
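The mechanism behind -Variables can be sketched in isolation: each hash table entry is added to an initial session state as a SessionStateVariableEntry, so that any runspace created from it starts with those variables defined. The hash table contents and the single standalone runspace below are assumptions for illustration; the function itself applies the same idea to a runspace pool.

```powershell
# Hypothetical variables to expose inside the runspace.
$variables = @{ greeting = 'Hello'; target = 'world' }

$iss = [initialsessionstate]::CreateDefault2()
foreach ($key in $variables.Keys) {
    # Arguments: variable name, value, description (left empty here).
    $entry = [System.Management.Automation.Runspaces.SessionStateVariableEntry]::new($key, $variables[$key], '')
    $iss.Variables.Add($entry)
}

# A runspace created from $iss starts with both variables defined.
$rs = [runspacefactory]::CreateRunspace($iss)
$rs.Open()

$ps = [powershell]::Create()
$ps.Runspace = $rs
$output = $ps.AddScript({ "$greeting, $target!" }).Invoke()

$ps.Dispose()
$rs.Dispose()

$output
```

The script block never declares $greeting or $target; they resolve only because the session state was seeded before the runspace was opened.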
-Functions
Existing functions in the Local Session to have available in the Script Block (Runspaces).
Type: String[]
Parameter Sets: (All)
Aliases:
Required: False
Position: Named
Default value: None
Accept pipeline input: False
Accept wildcard characters: False
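A minimal sketch of how a local function can be made visible to a runspace, mirroring what the function's begin block does with -Functions: the definition is read with Get-Command and registered as a SessionStateFunctionEntry. The Greet function is borrowed from Example 5; the standalone runspace and the 'PowerShell' argument are assumptions for the demonstration.

```powershell
function Greet { param($s) "$s hey there!" }

$iss = [initialsessionstate]::CreateDefault2()

# Copy the body of the local function into the session state under the same name.
$definition = (Get-Command Greet).Definition
$entry = [System.Management.Automation.Runspaces.SessionStateFunctionEntry]::new('Greet', $definition)
$iss.Commands.Add($entry)

$rs = [runspacefactory]::CreateRunspace($iss)
$rs.Open()

$ps = [powershell]::Create()
$ps.Runspace = $rs
$output = $ps.AddScript({ Greet 'PowerShell' }).Invoke()

$ps.Dispose()
$rs.Dispose()

$output
```

Only the function body is copied, as text. Anything the function relies on from the caller's session, such as other functions or script-scoped variables, would have to be passed in separately.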
-ThreadOptions
These options control whether a new thread is created when a command is executed within a Runspace.
This parameter is limited to ReuseThread and UseNewThread. Default value is ReuseThread.
See PSThreadOptions Enum for details.
Type: PSThreadOptions
Parameter Sets: (All)
Aliases:
Accepted values: ReuseThread, UseNewThread
Required: False
Position: Named
Default value: ReuseThread
Accept pipeline input: False
Accept wildcard characters: False