I need some help with controlling subprocesses; I'm not sure which direction to take or what to research either.
At the moment I am controlling the running of a piece of software using Python, as seen in the code below.
import subprocess
# Declaring some input and output names for the third-party program
label = ['fname1', 'fname2', 'fname3', 'fname4']
input_list = ['inp.' + l for l in label]
output_list = ['out.' + l for l in label]
# Run all sets of input and output using the third-party software
for inp, out in zip(input_list, output_list):
    # The bash command to run the executable
    command = f"mpirun pw.x < {inp} > {out}"
    subprocess.run(command, shell=True)
My computer has 64 logical cores, and as I've tested with the software, using 32 or 64 doesn't change the speed of the calculation. Hence, I would like to edit the code to accommodate two concurrent subprocess.run's with mpirun -n 32 ...
I don't know how to do the concurrent stuff, like queueing and controlling how many instances of the subprocess are allowed to run at a given time.
May I ask which module/library will help me get this done? Of course, a code sample would be very much appreciated.
P.S. PBS/SLURM systems are not an option because I am also doing some processing stuff within the python script.
Thanks!
CodePudding user response:
Try this if you want to run mpirun -n 32 twice in parallel:
import concurrent.futures
import subprocess

labels = ['fname1', 'fname2']
with concurrent.futures.ProcessPoolExecutor(max_workers=len(labels)) as pool:
    for label in labels:
        command = f"mpirun -n 32 pw.x < inp.{label} > out.{label}"
        pool.submit(subprocess.run, command, shell=True)
The with statement will close the pool at the end, which makes it wait for all the jobs to complete.
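If you have more than two input files and want at most two jobs running at any given time, setting max_workers=2 makes the executor queue the remaining jobs automatically. A thread pool is sufficient here, since each worker just blocks on subprocess.run while mpirun does the real work in separate processes. A minimal sketch, assuming the same pw.x invocation and file-naming scheme as above:

```python
import concurrent.futures
import subprocess

labels = ['fname1', 'fname2', 'fname3', 'fname4']

def run_job(label):
    # Each worker thread blocks here until its mpirun finishes,
    # so at most max_workers commands run at the same time;
    # the rest wait in the executor's internal queue.
    command = f"mpirun -n 32 pw.x < inp.{label} > out.{label}"
    return subprocess.run(command, shell=True)

# max_workers=2 caps the number of concurrent mpirun instances
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(run_job, labels))

# results[i].returncode tells you whether job i succeeded (0 = success)
```

Using pool.map instead of submit also collects the CompletedProcess objects in input order, so you can check return codes afterwards and re-run any failed inputs.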