Is there an easier way to run commands in Windows PowerShell while keeping it efficient?

Time:11-02

This self-answer intends to provide an easy and efficient parallelism alternative for those stuck with Windows PowerShell who are unable to install modules due to, for example, company policies.

In Windows PowerShell, the built-in alternatives for local parallel invocation are Start-Job and workflow. Both are known to be slow and inefficient, and workflow is not even recommended and is no longer available in newer versions of PowerShell.
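
For reference, a minimal Start-Job baseline might look like the sketch below. Each job runs in its own child process, which is a large part of why this approach is slow compared to runspaces. The script block body and the $n parameter are illustrative only.

```powershell
# Start three background jobs; each one is hosted in a separate child process.
$jobs = 0..2 | ForEach-Object {
    Start-Job -ScriptBlock {
        param($n)
        "hello world from job $n (PID $PID)"
    } -ArgumentList $_
}

# Wait for all jobs, collect their output and remove them from the job table.
$result = $jobs | Receive-Job -Wait -AutoRemoveJob
$result
```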

The other alternative is to rely on the PowerShell SDK and code our own parallel logic using what the System.Management.Automation.Runspaces namespace has to offer. This is definitely the most efficient approach and is what ForEach-Object -Parallel (in PowerShell Core) and Start-ThreadJob (preinstalled in PowerShell Core and available in Windows PowerShell through the PowerShell Gallery) use behind the scenes.

A simple example:

$throttlelimit = 3

$pool = [runspacefactory]::CreateRunspacePool(1, $throttlelimit)
$pool.Open()

$tasks = 0..10 | ForEach-Object {
    $ps = [powershell]::Create().AddScript({
        'hello world from {0}' -f [runspace]::DefaultRunspace.InstanceId
        Start-Sleep 3
    })
    $ps.RunspacePool = $pool

    @{ Instance = $ps; AsyncResult = $ps.BeginInvoke() }
}

$tasks | ForEach-Object {
    $_.Instance.EndInvoke($_.AsyncResult)
}

$tasks.Instance, $pool | ForEach-Object Dispose

This works well, but it gets tedious and often complicated as the code grows more complex, and consequently it raises lots of questions.
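
To illustrate where the extra work comes from, here is a sketch of the same pattern once the script block needs a local value, per-item input, and error handling. The $prefix variable and the output format are made up for the example.

```powershell
$pool = [runspacefactory]::CreateRunspacePool(1, 3)
$pool.Open()

$prefix = 'item'   # hypothetical local value that each runspace needs

$tasks = foreach ($i in 0..4) {
    # Local state is not visible in the runspace, so it is passed as arguments.
    $ps = [powershell]::Create().AddScript({
        param($prefix, $number)
        '{0} {1}' -f $prefix, $number
    }).AddArgument($prefix).AddArgument($i)
    $ps.RunspacePool = $pool

    @{ Instance = $ps; AsyncResult = $ps.BeginInvoke() }
}

$output = foreach ($task in $tasks) {
    try {
        $task.Instance.EndInvoke($task.AsyncResult)
        # Non-terminating errors are not surfaced automatically.
        if ($task.Instance.HadErrors) { $task.Instance.Streams.Error }
    }
    finally {
        $task.Instance.Dispose()
    }
}

$pool.Dispose()
$output
```

Every new variable means another parameter and another AddArgument call, and the cleanup has to be repeated in every script that uses this pattern.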

Is there an easier way to do it?

Answer:

Since this topic can be confusing and often brings questions to the site, I have decided to create a function that simplifies this tedious task and helps those stuck in Windows PowerShell. The aim is to keep it as simple and friendly as possible. It should also be a function that can be copy-pasted into our $PROFILE to be reused whenever needed, without requiring the installation of a module (as stated in the question).

This function has been greatly inspired by RamblingCookieMonster's Invoke-Parallel and Boe Prox's PoshRSJob and is merely a simplified take on those with a few improvements.

NOTE

Further updates to this function will be published to the official GitHub repo as well as to the PowerShell Gallery. The code in this answer will no longer be maintained.

Contributions are more than welcome. If you wish to contribute, fork the repo and submit a pull request with the changes.


DEFINITION

using namespace System.Collections.Generic
using namespace System.Management.Automation
using namespace System.Management.Automation.Runspaces
using namespace System.Management.Automation.Language
using namespace System.Text

function Invoke-Parallel {
    [CmdletBinding(PositionalBinding = $false)]
    [Alias('parallel', 'parallelpipeline')]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [object] $InputObject,

        [Parameter(Mandatory, Position = 0)]
        [scriptblock] $ScriptBlock,

        [Parameter()]
        [int] $ThrottleLimit = 5,

        [Parameter()]
        [hashtable] $Variables,

        [Parameter()]
        [ArgumentCompleter({
            param(
                [string] $commandName,
                [string] $parameterName,
                [string] $wordToComplete
            )

            (Get-Command -CommandType Filter, Function).Name -like "$wordToComplete*"
        })]
        [string[]] $Functions,

        [Parameter()]
        [ValidateSet('ReuseThread', 'UseNewThread')]
        [PSThreadOptions] $ThreadOptions = [PSThreadOptions]::ReuseThread
    )

    begin {
        try {
            $iss = [initialsessionstate]::CreateDefault2()

            foreach($key in $Variables.PSBase.Keys) {
                $iss.Variables.Add([SessionStateVariableEntry]::new($key, $Variables[$key], ''))
            }

            foreach($function in $Functions) {
                $def = (Get-Command $function).Definition
                $iss.Commands.Add([SessionStateFunctionEntry]::new($function, $def))
            }

            $usingParams = @{}

            foreach($usingstatement in $ScriptBlock.Ast.FindAll({ $args[0] -is [UsingExpressionAst] }, $true)) {
                $varText = $usingstatement.Extent.Text
                $varPath = $usingstatement.SubExpression.VariablePath.UserPath

                # Credits to mklement0 for catching a bug here. Thank you!
                # https://github.com/mklement0
                $key = [Convert]::ToBase64String([Encoding]::Unicode.GetBytes($varText))
                if(-not $usingParams.ContainsKey($key)) {
                    $usingParams.Add($key, $ExecutionContext.SessionState.PSVariable.Get($varPath).Value)
                }
            }

            $pool  = [runspacefactory]::CreateRunspacePool(1, $ThrottleLimit, $iss, $Host)
            $tasks = [List[hashtable]]::new()
            $pool.ThreadOptions = $ThreadOptions
            $pool.Open()
        }
        catch {
            $PSCmdlet.ThrowTerminatingError($_)
        }
    }
    process {
        try {
            # Thanks to Patrick Meinecke for his help here.
            # https://github.com/SeeminglyScience/
            $ps = [powershell]::Create().AddScript({
                $args[0].InvokeWithContext($null, [psvariable]::new('_', $args[1]))
            }).AddArgument($ScriptBlock.Ast.GetScriptBlock()).AddArgument($InputObject)

            # This is how `Start-Job` does its magic. Credits to Jordan Borean for his help here.
            # https://github.com/jborean93

            # Reference in the source code:
            # https://github.com/PowerShell/PowerShell/blob/7dc4587014bfa22919c933607bf564f0ba53db2e/src/System.Management.Automation/engine/ParameterBinderController.cs#L647-L653
            if($usingParams.Count) {
                $null = $ps.AddParameters(@{ '--%' = $usingParams })
            }

            $ps.RunspacePool = $pool

            $tasks.Add(@{
                Instance    = $ps
                AsyncResult = $ps.BeginInvoke()
            })
        }
        catch {
            $PSCmdlet.WriteError($_)
        }
    }
    end {
        try {
            foreach($task in $tasks) {
                $task['Instance'].EndInvoke($task['AsyncResult'])

                if($task['Instance'].HadErrors) {
                    $task['Instance'].Streams.Error
                }
            }
        }
        catch {
            $PSCmdlet.WriteError($_)
        }
        finally {
            $tasks.Instance, $pool | ForEach-Object Dispose
        }
    }
}
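
The $using: handling in the begin block above can be tried in isolation. The sketch below is a simplified take on that step: it finds every $using: expression in a script block and resolves the matching local value. It keys the table by the raw expression text instead of the Base64 encoding used in the function, and the variable names are illustrative.

```powershell
$message = 'hello'
$repeat  = 3

# The script block is only parsed here, never invoked.
$scriptBlock = { $using:message * $using:repeat; $using:message }

# Find every $using: expression, including those in nested script blocks.
$found = $scriptBlock.Ast.FindAll({
    $args[0] -is [System.Management.Automation.Language.UsingExpressionAst]
}, $true)

$usingValues = @{}
foreach ($usingExpr in $found) {
    $text = $usingExpr.Extent.Text                          # '$using:message'
    $name = $usingExpr.SubExpression.VariablePath.UserPath  # 'message'

    # The same variable can be referenced more than once; store it only once.
    if (-not $usingValues.ContainsKey($text)) {
        $usingValues[$text] = Get-Variable -Name $name -ValueOnly
    }
}

$usingValues
```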

SYNTAX

Invoke-Parallel -InputObject <Object> [-ScriptBlock] <ScriptBlock> [-ThrottleLimit <Int32>]
 [-Variables <Hashtable>] [-ThreadOptions <PSThreadOptions>] [-Functions <String[]>] [<CommonParameters>]

REQUIREMENTS

Compatible with Windows PowerShell 5.1 and PowerShell Core 7+.

INSTALLATION

If you wish to install this through the Gallery and have it available as a Module:

Install-Module PSParallelPipeline -Scope CurrentUser

EXAMPLES

EXAMPLE 1: Run slow script in parallel batches

$message = 'Hello world from {0}'

0..10 | Invoke-Parallel {
    $using:message -f [runspace]::DefaultRunspace.InstanceId
    Start-Sleep 3
} -ThrottleLimit 3

EXAMPLE 2: Same as previous example but with -Variables parameter

$message = 'Hello world from {0}'

0..10 | Invoke-Parallel {
    $message -f [runspace]::DefaultRunspace.InstanceId
    Start-Sleep 3
} -Variables @{ message = $message } -ThrottleLimit 3

EXAMPLE 3: Adding to a single thread-safe instance

$sync = [hashtable]::Synchronized(@{})

Get-Process | Invoke-Parallel {
    $sync = $using:sync
    $sync[$_.Name]  = @( $_ )
}

$sync

EXAMPLE 4: Same as previous example but using -Variables to pass the reference instance to the Runspaces

This is the recommended method when passing reference instances to the runspaces, since $using: may fail in some situations.

$sync = [hashtable]::Synchronized(@{})

Get-Process | Invoke-Parallel {
    $sync[$_.Name]  = @( $_ )
} -Variables @{ sync = $sync }

$sync
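
Behind the scenes, -Variables adds each entry to the initial session state of the runspace pool, so every runspace sees the same reference. A minimal standalone sketch of that mechanism, with illustrative key names and values, could look like this:

```powershell
$sync = [hashtable]::Synchronized(@{})

# Register $sync as a variable that exists in every runspace of the pool.
$iss   = [initialsessionstate]::CreateDefault2()
$entry = [System.Management.Automation.Runspaces.SessionStateVariableEntry]::new('sync', $sync, '')
$iss.Variables.Add($entry)

$pool = [runspacefactory]::CreateRunspacePool(1, 2, $iss, $Host)
$pool.Open()

$tasks = foreach ($i in 1..4) {
    $ps = [powershell]::Create().AddScript({
        param($n)
        # $sync here is the same object as $sync in the caller's session.
        $sync["key$n"] = $n * 10
    }).AddArgument($i)
    $ps.RunspacePool = $pool

    @{ Instance = $ps; AsyncResult = $ps.BeginInvoke() }
}

foreach ($task in $tasks) {
    $null = $task.Instance.EndInvoke($task.AsyncResult)
    $task.Instance.Dispose()
}

$pool.Dispose()
$sync
```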

EXAMPLE 5: Demonstrates how to pass a locally defined Function to the Runspace scope

function Greet { param($s) "$s hey there!" }

0..10 | Invoke-Parallel {
    Greet $_
} -Functions Greet
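
The -Functions parameter works in a similar way: the function definition is copied into the initial session state as a SessionStateFunctionEntry. The sketch below shows that step with a single runspace instead of a pool to keep it short.

```powershell
function Greet { param($s) "$s hey there!" }

# Copy the body of the local function into a new initial session state.
$iss        = [initialsessionstate]::CreateDefault2()
$definition = (Get-Command Greet -CommandType Function).Definition
$iss.Commands.Add(
    [System.Management.Automation.Runspaces.SessionStateFunctionEntry]::new('Greet', $definition))

$rs = [runspacefactory]::CreateRunspace($iss)
$rs.Open()

# Greet is defined in the new runspace even though it was declared locally.
$ps = [powershell]::Create()
$ps.Runspace = $rs
$greeting = $ps.AddScript('Greet 1').Invoke()

$ps.Dispose()
$rs.Dispose()
$greeting
```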

PARAMETERS

-InputObject

Specifies the input objects to be processed in the ScriptBlock.
Note: This parameter is intended to be bound from the pipeline.

Type: Object
Parameter Sets: (All)
Aliases:

Required: True
Position: Named
Default value: None
Accept pipeline input: True (ByValue)
Accept wildcard characters: False

-ScriptBlock

Specifies the operation that is performed on each input object.
This script block is run for every object in the pipeline.

Type: ScriptBlock
Parameter Sets: (All)
Aliases:

Required: True
Position: 0
Default value: None
Accept pipeline input: False
Accept wildcard characters: False

-ThrottleLimit

Specifies the number of script blocks that are invoked in parallel.
Input objects are blocked until the running script block count falls below the ThrottleLimit.
The default value is 5.

Type: Int32
Parameter Sets: (All)
Aliases:

Required: False
Position: Named
Default value: 5
Accept pipeline input: False
Accept wildcard characters: False
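
The throttling itself comes from the maximum size of the runspace pool. A rough way to observe it, assuming four tasks of half a second each and a limit of two, is to time the batch: the tasks have to run in two waves, so the total should be close to one second.

```powershell
$pool = [runspacefactory]::CreateRunspacePool(1, 2)   # at most 2 concurrent runspaces
$pool.Open()

$elapsed = Measure-Command {
    $tasks = foreach ($i in 1..4) {
        $ps = [powershell]::Create().AddScript({ Start-Sleep -Milliseconds 500 })
        $ps.RunspacePool = $pool

        # BeginInvoke returns immediately; the pool queues work beyond its limit.
        @{ Instance = $ps; AsyncResult = $ps.BeginInvoke() }
    }

    foreach ($task in $tasks) {
        $null = $task.Instance.EndInvoke($task.AsyncResult)
        $task.Instance.Dispose()
    }
}

$pool.Dispose()
'{0:N1} s' -f $elapsed.TotalSeconds
```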

-Variables

Specifies a hash table of variables to make available in the script block (runspaces). The hash table keys become the variable names inside the script block.

Type: Hashtable
Parameter Sets: (All)
Aliases:

Required: False
Position: Named
Default value: None
Accept pipeline input: False
Accept wildcard characters: False

-Functions

Specifies existing functions in the local session to make available in the script block (runspaces).

Type: String[]
Parameter Sets: (All)
Aliases:

Required: False
Position: Named
Default value: None
Accept pipeline input: False
Accept wildcard characters: False

-ThreadOptions

These options control whether a new thread is created when a command is executed within a Runspace.
This parameter is limited to ReuseThread and UseNewThread. Default value is ReuseThread.
See PSThreadOptions Enum for details.

Type: PSThreadOptions
Parameter Sets: (All)
Aliases:
Accepted values: ReuseThread, UseNewThread

Required: False
Position: Named
Default value: ReuseThread
Accept pipeline input: False
Accept wildcard characters: False
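
As a quick illustration of the underlying setting, the sketch below lists the members of the PSThreadOptions enum and applies one of them to a runspace pool. The option has to be set before the pool is opened. The script block body is only an example.

```powershell
# All members of the enum; the function only accepts ReuseThread and UseNewThread.
$names = [enum]::GetNames([System.Management.Automation.Runspaces.PSThreadOptions])
$names

$pool = [runspacefactory]::CreateRunspacePool(1, 2)
$pool.ThreadOptions = [System.Management.Automation.Runspaces.PSThreadOptions]::UseNewThread
$pool.Open()

# Report the managed thread that ran the script block.
$ps = [powershell]::Create().AddScript({
    [System.Threading.Thread]::CurrentThread.ManagedThreadId
})
$ps.RunspacePool = $pool
$threadId = $ps.Invoke()

$ps.Dispose()
$pool.Dispose()
$threadId
```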