As with all things, the longer you play with something, the more you learn about it. It has been nearly 5 years since I wrote my original article on multithreading, where I used PowerShell jobs to run multiple items at a time. In the conversations following that post, a reader submitted a fairly nasty note saying that multithreading with jobs was in fact not real multithreading. He also selected some choice words for me, making sure I couldn’t post his comments. It hurt my very fragile feelings and, as a result, I started looking at how I could really, truly multithread with PowerShell. The result of that effort is this script. This is a true multithreading script, no “ifs”, “ands” or “buts”.
It runs MUCH faster than my other rendition and includes a much more advanced feature set than my other script. The downside is that the script is very difficult to understand if you are new to PowerShell. If you would like a script that is more transparent and easier to understand, please refer to the version that uses Jobs.
Okay, so the big addition for this script is the ability to run either a script file or a built-in cmdlet. You can also run it within the pipeline! To do that I had to include the begin, process and end blocks. It makes the script a bit more complex, but really pays off when you pipe your custom script into Out-GridView or pipe your advanced filtering script into a multithreaded one! I have even pulled my SCCM collections via Get-WmiObject and piped them into a multithreaded script! Cool stuff!
Okay, so on to the breakdown.
First we need to get all of our parameters. If this doesn’t make sense, please find my post on parameters!
    Param($Command = $(Read-Host "Enter the script file"),
        [Parameter(ValueFromPipeline=$true,ValueFromPipelineByPropertyName=$true)]$ObjectList,
        $InputParam = $Null,
        $MaxThreads = 20,
        $SleepTimer = 200,
        $MaxResultTime = 120,
        [HashTable]$AddParam = @{},
        [Array]$AddSwitch = @()
    )
All of these are defined as follows:
Command
This is where you provide the PowerShell cmdlet or script file that you want to multithread. You can also choose a built-in cmdlet. Keep in mind that a script file is read into a scriptblock, so any unforeseen errors are likely caused by the conversion to a script block.
ObjectList
The objectlist represents the arguments that are provided to the child script. This is an open-ended argument and can take a single object from the pipeline, an array, a collection, or a file name. The multithreading script does its best to find out which you have provided and handle it as such. If you would like to provide a file, then the file is read with one object on each line, and each line is provided as is to the script you are running as a string. If this is not desired, then use an array.
InputParam
This allows you to specify the parameter for which your input objects are to be evaluated. As an example, if you were to provide a computer name to the Get-Process cmdlet as just an argument, it would attempt to find all processes where the name was the provided computer name and fail. You need to specify that the parameter that you are providing is the “ComputerName”.
AddParam
This allows you to specify additional parameters to the running command. For instance, if you are trying to find the status of the “BITS” service on all servers in your list, you will need to specify the “Name” parameter. This command takes a hash pair formatted as follows:
    @{"ParameterName" = "Value"}
    @{"ParameterName" = "Value" ; "ParameterTwo" = "Value2"}
AddSwitch
This allows you to add additional switches to the command you are running. For instance, you may want to include “RequiredServices” with the “Get-Service” cmdlet. This parameter will take a single string, or an array of strings, as follows:
    "RequiredServices"
    @("RequiredServices", "DependentServices")
MaxThreads
This is the maximum number of threads to run at any given time. If resources are too congested try lowering this number. The default value is 20.
SleepTimer
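This is the time between cycles of the child process detection cycle. The default value is 200ms. If CPU utilization is high then you can consider increasing this delay. If the child script takes a long time to run, then you might increase this value to around 1000 (or 1 second in the detection cycle).
MaxResultTime
This is the maximum number of seconds to wait after the last job completes before the script assumes the remaining child threads are frozen and exits. The default value is 120.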
Now we need to set everything up. This stuff needs to execute outside of the pipeline, so we place it in the “begin” block of the script.
    Begin{
        $ISS = [system.management.automation.runspaces.initialsessionstate]::CreateDefault()
        $RunspacePool = [runspacefactory]::CreateRunspacePool(1, $MaxThreads, $ISS, $Host)
        $RunspacePool.Open()

        If ($(Get-Command | Select-Object Name) -match $Command){
            $Code = $Null
        }Else{
            $OFS = "`r`n"
            $Code = [ScriptBlock]::Create($(Get-Content $Command))
            Remove-Variable OFS
        }
        $Jobs = @()
    }
Okay, so to break this down, first we need to make our ISS, or initial session state. This is basically the session state to be used when we open our Runspace. Next we create our Runspaces. The RunspacePool is really what’s going to do the multithreading. It will handle starting our threads and continuously start new ones as required. It is the operating environment for our command pipeline. Finally we open the RunSpacePool. Note that “.Open()” opens the RunSpacePool synchronously, creating a Windows PowerShell execution environment.
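To see the pool on its own, here is a minimal sketch that is separate from the script in this post. It assumes the simpler two-argument CreateRunspacePool(min, max) overload and a throwaway script block; the $Pool, $Thread and $Result names are only for illustration.

```powershell
# Illustrative only: a tiny runspace pool round trip, not the article's script.
$Pool = [runspacefactory]::CreateRunspacePool(1, 4)   # minimum 1, maximum 4 runspaces
$Pool.Open()

# Build a pipeline that doubles whatever number it is handed.
$Thread = [powershell]::Create().AddScript({ Param($Number) $Number * 2 }).AddArgument(21)
$Thread.RunspacePool = $Pool     # run inside the pool instead of a private runspace

$Result = $Thread.Invoke()       # synchronous call, used here only to keep the sketch short

# Clean up the pipeline first, then the pool.
$Thread.Dispose()
$Pool.Close()
$Pool.Dispose()
```

Invoke() blocks until the work is finished, which is fine for a demo. The script in this post uses BeginInvoke() instead so that many threads can run side by side.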
Now I am running a detection on what the user provided for the $Command parameter. First I will look at all of the currently loaded cmdlets. If it is one of those, then we continue. Otherwise we assume it is a script file. If it is a script file then we need to read the file into a script block that we can pass to our future threads. To do this we need to change the default $OFS (Output Field Separator) so that the lines of the file are joined with line breaks instead of spaces. For more on this, please read my other post!
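If the $OFS trick is new to you, this small sketch shows the effect. The $Lines array is made up for the example and stands in for what Get-Content returns.

```powershell
# Illustrative only: how $OFS changes the way an array is turned into a string.
$Lines = @('Get-Date', 'Get-Location')   # stand-in for the lines of a script file

$Default = "$Lines"      # with no $OFS set, elements are joined with a single space

$OFS = "`r`n"            # join with a carriage return / line feed instead
$Joined = "$Lines"       # now each element lands on its own line
Remove-Variable OFS      # put things back so later string conversions are unaffected
```

Without the change, a multi-line script file would be squashed onto one line before being turned into a script block. On PowerShell 3.0 and later, Get-Content -Raw returns the file as a single string, which would be another way to get the same result.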
Okay, so the next step is to start receiving items from the pipeline. We can do this by starting the process block. Note that the process block is executed for each item we find in the pipeline. Meanwhile, if you did not execute in the pipeline it is executed once for the script as a whole. What this means is that we need to assume that $ObjectList will either be a single item or multiple items. The best way to do that is to use a ForEach Loop.
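As a quick aside before the real thing, here is a small sketch of that behavior. Show-Blocks is a hypothetical function written only for this demonstration; it counts how many times its process block runs.

```powershell
# Illustrative only: the process block runs once per piped item,
# but only once when the whole array is passed as a parameter.
function Show-Blocks {
    Param([Parameter(ValueFromPipeline=$true)]$ObjectList)
    Begin   { $Calls = 0 }
    Process {
        $Calls++
        # The ForEach handles both cases: one item per call, or a whole array in one call.
        ForEach ($Object in $ObjectList) { "item: $Object" }
    }
    End     { "process ran $Calls time(s)" }
}

$Piped  = 1, 2, 3 | Show-Blocks               # process block runs three times
$Direct = Show-Blocks -ObjectList 1, 2, 3     # process block runs once
```

Both calls emit the same three items; only the final line differs. The script's own process block follows.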
    Process{
        Write-Progress -Activity "Preloading threads" -Status "Starting Job $($jobs.count)"
        ForEach ($Object in $ObjectList){
            If ($Code -eq $Null){
                $PowershellThread = [powershell]::Create().AddCommand($Command)
            }Else{
                $PowershellThread = [powershell]::Create().AddScript($Code)
            }
            If ($InputParam -ne $Null){
                $PowershellThread.AddParameter($InputParam, $Object.ToString()) | out-null
            }Else{
                $PowershellThread.AddArgument($Object.ToString()) | out-null
            }
            ForEach($Key in $AddParam.Keys){
                $PowershellThread.AddParameter($Key, $AddParam.$key) | out-null
            }
            ForEach($Switch in $AddSwitch){
                $PowershellThread.AddParameter($Switch) | out-null
            }
            $PowershellThread.RunspacePool = $RunspacePool
            $Handle = $PowershellThread.BeginInvoke()
            $Job = "" | Select-Object Handle, Thread, object
            $Job.Handle = $Handle
            $Job.Thread = $PowershellThread
            $Job.Object = $Object.ToString()
            $Jobs += $Job
        }
    }
So now we have to build the thread that we are going to execute. We do this by adding either the command, or the script. The first IF block is to determine which. A thread either takes an existing PowerShell command, or a Scriptblock. If you remember we built this out in the Begin statement above. Once that is done we need to start giving the user the power to control the item we are calling.
First things first, we look at $InputParam. This is what allows the user to execute the child script not just with the argument provided, but also to specify the parameter it belongs to. We see this is useful with the Get-Process cmdlet. Let’s say that you want to see the processes running on 20 different servers. If you just ran Get-Process ServerName you would be looking at your local machine for any processes with the name “ServerName” and you would get no return (probably). Instead you would want to run Get-Process -ComputerName ServerName. The trick here is that when you do this you’ve actually changed things! When an item is just hanging at the end of the statement, it is called an Argument. When you pair the item you are setting with the name of the setting, it is called a Parameter. So if the user wants to specify the parameter name, we are actually adding a different item to our thread!
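Here is a small sketch of that difference. The script block and the 'Server01' value are invented for the example; the only calls that matter are .AddArgument() and .AddParameter().

```powershell
# Illustrative only: the same value lands in different places depending on
# whether it is added as a positional argument or as a named parameter.
$Script = { Param($Name, $ComputerName) "Name=$Name ComputerName=$ComputerName" }

# Positional argument: bound to the first parameter, $Name.
$ByArgument = [powershell]::Create().AddScript($Script).AddArgument('Server01')
$ArgResult  = $ByArgument.Invoke()
$ByArgument.Dispose()

# Named parameter: bound to $ComputerName, like -ComputerName Server01.
$ByParameter = [powershell]::Create().AddScript($Script).AddParameter('ComputerName', 'Server01')
$ParamResult = $ByParameter.Invoke()
$ByParameter.Dispose()
```

This mirrors the Get-Process situation: without the parameter name, the server name ends up in the first positional slot.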
Now we need to see if the user wanted to add some extra parameters. For this I decided that a hash table was perfect. This is because they are built much like a parameter, as they are name / value pairs. The user can provide as many as they want in a single hash table, and we can easily run a ForEach to evaluate this. Again we are going to use the .AddParameter() statement to evaluate.
Finally we need to add any switches that the user wants to add. A good example of this is the -Force switch on cmdlets like Get-ChildItem. We can do this with an array of strings, which again allows the user to put in as many switches as they like. One interesting note here is that I had to use .AddParameter() for this as well. Instead of creating a method called .AddSwitch(), Microsoft simply chose to add an overloaded definition of .AddParameter(). What’s peculiar is that even in their documentation (link below) it does in fact say that providing just a string adds a switch instead of a parameter.
Now that we have our thread all set up, we simply attach the RunspacePool that we set up in the begin block, and then tell it to execute our thread!
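A short sketch of the single-string overload in action. The script block with its [switch]$Force parameter is made up for the example.

```powershell
# Illustrative only: calling AddParameter with just a name turns a switch on.
$Script = { Param([switch]$Force) "Force=$Force" }

$Thread = [powershell]::Create().AddScript($Script)
$Thread.AddParameter('Force') | Out-Null   # no value supplied, so it is treated as a switch
$SwitchResult = $Thread.Invoke()
$Thread.Dispose()
```

The Out-Null is there because .AddParameter() returns the PowerShell object itself, which would otherwise be written to the output.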
To avoid having a thread hanging around that we lose track of, we need to do some tracking here. To do this we first catch the thread by capturing the output of .BeginInvoke() in a $Handle variable. This will provide the link back to the thread in our “End” block. I then go ahead and create a custom object, which I have named $Job, to hold all these little gems of knowledge. Then I add my custom object to the tracking array called $Jobs. This method of creating objects was taught to me by my reader from this post!
At this point all of the code to start the jobs is complete! Now we just have to grab all of the jobs back. That calls for the “End” block, which is executed once per script run. Since that is the case, we need one main loop to ensure that it stays running while we need it to. Within the “End” block we will watch for jobs to finish and provide their output as they complete.
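For anyone who has not seen the "" | Select-Object trick before, here is a tiny sketch of it by itself. The 'Server01' value is a placeholder.

```powershell
# Illustrative only: an empty string piped to Select-Object gives back a
# custom object with the named properties, all initially empty.
$Job = "" | Select-Object Handle, Thread, Object
$Job.Object = 'Server01'     # fill in properties as the values become available

$Jobs = @()
$Jobs += $Job                # collect the tracking objects in an array

$PropertyNames = $Job.PSObject.Properties | ForEach-Object { $_.Name }
```

On PowerShell 3.0 and later, [PSCustomObject]@{ Handle = $Null; Thread = $Null; Object = 'Server01' } would be another way to build the same kind of object.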
    End{
        $ResultTimer = Get-Date
        While (@($Jobs | Where-Object {$_.Handle -ne $Null}).count -gt 0)  {

            $Remaining = "$($($Jobs | Where-Object {$_.Handle.IsCompleted -eq $False}).object)"
            If ($Remaining.Length -gt 60){
                $Remaining = $Remaining.Substring(0,60) + "..."
            }
            Write-Progress `
                -Activity "Waiting for Jobs - $($MaxThreads - $($RunspacePool.GetAvailableRunspaces())) of $MaxThreads threads running" `
                -PercentComplete (($Jobs.count - $($($Jobs | Where-Object {$_.Handle.IsCompleted -eq $False}).count)) / $Jobs.Count * 100) `
                -Status "$(@($($Jobs | Where-Object {$_.Handle.IsCompleted -eq $False})).count) remaining - $remaining"

            ForEach ($Job in $($Jobs | Where-Object {$_.Handle.IsCompleted -eq $True})){
                $Job.Thread.EndInvoke($Job.Handle)
                $Job.Thread.Dispose()
                $Job.Thread = $Null
                $Job.Handle = $Null
                $ResultTimer = Get-Date
            }
            If (($(Get-Date) - $ResultTimer).totalseconds -gt $MaxResultTime){
                Write-Error "Child script appears to be frozen, try increasing MaxResultTime"
                Exit
            }
            Start-Sleep -Milliseconds $SleepTimer

        }
        $RunspacePool.Close() | Out-Null
        $RunspacePool.Dispose() | Out-Null
    }
The first part of this is simply to provide some form of pretty output. First, I evaluate which jobs are still running and create a (truncated) string that names them. I added this so that if the child script is failing on just one or two servers, you will know which ones they are and can fix them. Then comes a rather complex Write-Progress statement, which I’ll let you look into. For more info on Write-Progress you can read my blog articles on Write-Progress or on getting the progress of a child job.
Okay, so now we look for jobs that are completed. We can do this by looking into our array of objects for entries where the handle (that we captured above) has .IsCompleted set to $true! We then run a ForEach over each of these to end the invocation and dispose of the thread. Keep in mind that .EndInvoke() returns all of the output from the thread, while .Dispose() only releases its resources. What this means is that the script will actually write the output as jobs finish instead of having to wait for all jobs to finish!
Next I added some protection to the script. I had a couple of scripts in my arsenal that would lock up and never finish. When this happened the multithreading script would continue to run and loop forever and ever. To stop this I added a maximum time to wait for additional jobs to finish. To do this I simply record the system clock time when the last job completed and look at the gap between then and now. If it is greater than our $MaxResultTime parameter then we’ll throw an error and exit. Note that until PowerShell itself closes, those threads will continue to hang around!
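Stripped down to a single thread, the pattern looks roughly like this. It is a sketch with a made-up script block, not a replacement for the loop above.

```powershell
# Illustrative only: start a thread, poll its handle, then collect the output.
$Thread = [powershell]::Create().AddScript({ Start-Sleep -Milliseconds 100; 'done' })
$Handle = $Thread.BeginInvoke()          # returns immediately with an async handle

While (-not $Handle.IsCompleted) {       # poll until the thread reports completion
    Start-Sleep -Milliseconds 50
}

$Output = $Thread.EndInvoke($Handle)     # the thread's output comes back from EndInvoke
$Thread.Dispose()                        # Dispose only frees resources
```

In the full script the EndInvoke() call is not assigned to a variable, so whatever it returns flows straight down the pipeline.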
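The check itself is just date arithmetic. Here it is pulled out into a hypothetical helper, Test-JobsStalled, which does not exist in the script and is only here so the comparison can be tried with fixed times.

```powershell
# Illustrative only: the "has it been too long since the last result?" test.
function Test-JobsStalled {
    Param(
        [datetime]$LastResult,             # when the last job finished
        [datetime]$Now = (Get-Date),       # current time, overridable for the demo
        [int]$MaxResultTime = 120          # allowed gap in seconds
    )
    # Subtracting two dates gives a TimeSpan; compare its length in seconds.
    ($Now - $LastResult).TotalSeconds -gt $MaxResultTime
}

$Start     = Get-Date
$StillFine = Test-JobsStalled -LastResult $Start -Now ($Start.AddSeconds(60))
$Stalled   = Test-JobsStalled -LastResult $Start -Now ($Start.AddSeconds(121))
```

Because $ResultTimer is reset every time a job completes, the limit applies to the quiet period between results, not to the total run time.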
As a very last step we clean up our $RunspacePool.
Well, that’s all folks! I really hope that this script saves my wonderful readers a lot of time. I know it has saved me hundreds of man hours!
Following is the full script with the comment block intact for your cutting and pasting pleasure!
    #.Synopsis
    #    This is a quick and open-ended script multi-threader searcher
    #
    #.Description
    #    This script will allow any general, external script to be multithreaded by providing a single
    #    argument to that script and opening it in a separate thread.  It works as a filter in the
    #    pipeline, or as a standalone script.  It will read the argument either from the pipeline
    #    or from a filename provided.  It will send the results of the child script down the pipeline,
    #    so it is best to use a script that returns some sort of object.
    #
    #    Authored by Ryan Witschger - http://www.Get-Blog.com
    #
    #.PARAMETER Command
    #    This is where you provide the PowerShell cmdlet / script file that you want to multithread.
    #    You can also choose a built-in cmdlet.  Keep in mind that a script file is read into
    #    a scriptblock, so any unforeseen errors are likely caused by the conversion to a script block.
    #
    #.PARAMETER ObjectList
    #    The objectlist represents the arguments that are provided to the child script.  This is an open-ended
    #    argument and can take a single object from the pipeline, an array, a collection, or a file name.  The
    #    multithreading script does its best to find out which you have provided and handle it as such.
    #    If you would like to provide a file, then the file is read with one object on each line and will
    #    be provided as is to the script you are running as a string.  If this is not desired, then use an array.
    #
    #.PARAMETER InputParam
    #    This allows you to specify the parameter for which your input objects are to be evaluated.  As an example,
    #    if you were to provide a computer name to the Get-Process cmdlet as just an argument, it would attempt to
    #    find all processes where the name was the provided computername and fail.  You need to specify that the
    #    parameter that you are providing is the "ComputerName".
    #
    #.PARAMETER AddParam
    #    This allows you to specify additional parameters to the running command.  For instance, if you are trying
    #    to find the status of the "BITS" service on all servers in your list, you will need to specify the "Name"
    #    parameter.  This command takes a hash pair formatted as follows:
    #
    #    @{"ParameterName" = "Value"}
    #    @{"ParameterName" = "Value" ; "ParameterTwo" = "Value2"}
    #
    #.PARAMETER AddSwitch
    #    This allows you to add additional switches to the command you are running.  For instance, you may want
    #    to include "RequiredServices" to the "Get-Service" cmdlet.  This parameter will take a single string, or
    #    an array of strings as follows:
    #
    #    "RequiredServices"
    #    @("RequiredServices", "DependentServices")
    #
    #.PARAMETER MaxThreads
    #    This is the maximum number of threads to run at any given time.  If resources are too congested try lowering
    #    this number.  The default value is 20.
    #
    #.PARAMETER SleepTimer
    #    This is the time between cycles of the child process detection cycle.  The default value is 200ms.  If CPU
    #    utilization is high then you can consider increasing this delay.  If the child script takes a long time to
    #    run, then you might increase this value to around 1000 (or 1 second in the detection cycle).
    #
    #
    #.EXAMPLE
    #    Both of these will execute the script named ServerInfo.ps1 and provide each of the server names in AllServers.txt
    #    while providing the results to the screen.  The results will be the output of the child script.
    #
    #    gc AllServers.txt | .\Run-CommandMultiThreaded.ps1 -Command .\ServerInfo.ps1
    #    .\Run-CommandMultiThreaded.ps1 -Command .\ServerInfo.ps1 -ObjectList (gc .\AllServers.txt)
    #
    #.EXAMPLE
    #    The following demonstrates the use of the AddParam statement
    #
    #    $ObjectList | .\Run-CommandMultiThreaded.ps1 -Command "Get-Service" -InputParam ComputerName -AddParam @{"Name" = "BITS"}
    #
    #.EXAMPLE
    #    The following demonstrates the use of the AddSwitch statement
    #
    #    $ObjectList | .\Run-CommandMultiThreaded.ps1 -Command "Get-Service" -AddSwitch @("RequiredServices", "DependentServices")
    #
    #.EXAMPLE
    #    The following demonstrates the use of the script in the pipeline
    #
    #    $ObjectList | .\Run-CommandMultiThreaded.ps1 -Command "Get-Service" -InputParam ComputerName -AddParam @{"Name" = "BITS"} | Select Status, MachineName
    #

    Param($Command = $(Read-Host "Enter the script file"),
        [Parameter(ValueFromPipeline=$true,ValueFromPipelineByPropertyName=$true)]$ObjectList,
        $InputParam = $Null,
        $MaxThreads = 20,
        $SleepTimer = 200,
        $MaxResultTime = 120,
        [HashTable]$AddParam = @{},
        [Array]$AddSwitch = @()
    )

    Begin{
        $ISS = [system.management.automation.runspaces.initialsessionstate]::CreateDefault()
        $RunspacePool = [runspacefactory]::CreateRunspacePool(1, $MaxThreads, $ISS, $Host)
        $RunspacePool.Open()

        If ($(Get-Command | Select-Object Name) -match $Command){
            $Code = $Null
        }Else{
            $OFS = "`r`n"
            $Code = [ScriptBlock]::Create($(Get-Content $Command))
            Remove-Variable OFS
        }
        $Jobs = @()
    }

    Process{
        Write-Progress -Activity "Preloading threads" -Status "Starting Job $($jobs.count)"
        ForEach ($Object in $ObjectList){
            If ($Code -eq $Null){
                $PowershellThread = [powershell]::Create().AddCommand($Command)
            }Else{
                $PowershellThread = [powershell]::Create().AddScript($Code)
            }
            If ($InputParam -ne $Null){
                $PowershellThread.AddParameter($InputParam, $Object.ToString()) | out-null
            }Else{
                $PowershellThread.AddArgument($Object.ToString()) | out-null
            }
            ForEach($Key in $AddParam.Keys){
                $PowershellThread.AddParameter($Key, $AddParam.$key) | out-null
            }
            ForEach($Switch in $AddSwitch){
                $PowershellThread.AddParameter($Switch) | out-null
            }
            $PowershellThread.RunspacePool = $RunspacePool
            $Handle = $PowershellThread.BeginInvoke()
            $Job = "" | Select-Object Handle, Thread, object
            $Job.Handle = $Handle
            $Job.Thread = $PowershellThread
            $Job.Object = $Object.ToString()
            $Jobs += $Job
        }
    }

    End{
        $ResultTimer = Get-Date
        While (@($Jobs | Where-Object {$_.Handle -ne $Null}).count -gt 0)  {

            $Remaining = "$($($Jobs | Where-Object {$_.Handle.IsCompleted -eq $False}).object)"
            If ($Remaining.Length -gt 60){
                $Remaining = $Remaining.Substring(0,60) + "..."
            }
            Write-Progress `
                -Activity "Waiting for Jobs - $($MaxThreads - $($RunspacePool.GetAvailableRunspaces())) of $MaxThreads threads running" `
                -PercentComplete (($Jobs.count - $($($Jobs | Where-Object {$_.Handle.IsCompleted -eq $False}).count)) / $Jobs.Count * 100) `
                -Status "$(@($($Jobs | Where-Object {$_.Handle.IsCompleted -eq $False})).count) remaining - $remaining"

            ForEach ($Job in $($Jobs | Where-Object {$_.Handle.IsCompleted -eq $True})){
                $Job.Thread.EndInvoke($Job.Handle)
                $Job.Thread.Dispose()
                $Job.Thread = $Null
                $Job.Handle = $Null
                $ResultTimer = Get-Date
            }
            If (($(Get-Date) - $ResultTimer).totalseconds -gt $MaxResultTime){
                Write-Error "Child script appears to be frozen, try increasing MaxResultTime"
                Exit
            }
            Start-Sleep -Milliseconds $SleepTimer

        }
        $RunspacePool.Close() | Out-Null
        $RunspacePool.Dispose() | Out-Null
    }
Further Reading:
Initial Session State:
http://msdn.microsoft.com/en-us/library/system.management.automation.runspaces.initialsessionstate%28v=vs.85%29.aspx
Runspaces:
http://msdn.microsoft.com/en-us/library/System.Management.Automation.Runspaces.Runspace(v=vs.85).aspx
PowerShell Class:
http://msdn.microsoft.com/en-us/library/system.management.automation.powershell%28v=vs.85%29.aspx
PowerShell AddParameter Method:
http://msdn.microsoft.com/en-us/library/system.management.automation.powershell.addparameter%28v=vs.85%29.aspx