I have a program, let's say "main.py", which is run with an argument, e.g. "python main.py 3" or "python main.py 47", where the number selects a specific ID inside the program itself.
I'm trying to write another script, let's say "start.py", that starts a certain number of such programs. If inside start.py I have written threads = 4, timeout = 5, then it should run "python main.py 1", "python main.py 2", "python main.py 3", "python main.py 4" concurrently, starting each command 5 seconds after the previous one.
I know how to do this in one thread, but then the next command does not start until the previous one completes:
import os
import time

threads = 4
id = 1
for i in range(threads):
    os.system(f"python main.py {id}")
    id += 1
    time.sleep(5)
I am trying to do this via multiprocessing, but I am failing. What is the best way to implement this, and am I going in the right direction?
I've already done this through bash, but I need to do it in Python only:
for ((i=1; i<=4; i++))
do
    python3 main.py "$i" &
done
CodePudding user response:
If you don't want to or can't make changes to main.py, then the simplest change you can make to your current code is to execute the system call in a thread so you do not block:
from threading import Thread
import os
import time

def run_main(id):
    os.system(f"python main.py {id}")

threads = 4
id = 1
started_threads = []
for i in range(threads):
    if i != 0:
        time.sleep(5)
    t = Thread(target=run_main, args=(id,))
    t.start()
    started_threads.append(t)
    id += 1

for t in started_threads:
    t.join()
Note that I have moved the call to time.sleep to the top of the loop, since your version did an extra, unneeded sleep after the last command was launched.
But this is rather expensive in that you are starting a new Python interpreter for each invocation of main.py. If I understand the comment offered by @BoarGules (although what he literally described would not run the function main 4 times in parallel but rather sequentially), the following is an alternative implementation if main.py is structured like this:
import sys

def main(id):
    ...  # process

if __name__ == '__main__':
    main(sys.argv[1])
Then your start.py, if running under Linux or some other platform that uses fork to start new processes, is coded as follows:
from multiprocessing import Process
import time

import main

threads = 4
id = 1
started_processes = []
for i in range(threads):
    if i != 0:
        time.sleep(5)
    p = Process(target=main.main, args=(id,))
    p.start()
    started_processes.append(p)
    id += 1

for p in started_processes:
    p.join()
But if you are running under Windows or some platform that uses spawn to start new processes, then you must code start.py as follows:
from multiprocessing import Process
import time

import main

# required for Windows, which uses spawn to start new processes:
if __name__ == '__main__':
    threads = 4
    id = 1
    started_processes = []
    for i in range(threads):
        if i != 0:
            time.sleep(5)
        p = Process(target=main.main, args=(id,))
        p.start()
        started_processes.append(p)
        id += 1

    for p in started_processes:
        p.join()
And under spawn, each new Process instance you create ends up starting a new Python interpreter anyway, so you will not be saving much over the initial solution I offered.
This is why, when you post a question tagged with multiprocessing, you are supposed to also tag the question with the platform.
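For completeness: since each launch is a whole new interpreter anyway, the closest Python equivalent of your bash `&` version is subprocess.Popen, which returns immediately instead of blocking the way os.system does. A minimal sketch (assuming main.py sits in the current working directory):

```python
import subprocess
import sys
import time

threads = 4  # number of copies of main.py to launch
delay = 5    # seconds between launches

procs = []
for id in range(1, threads + 1):
    if id != 1:
        time.sleep(delay)
    # Popen returns immediately, like "python3 main.py $i &" in bash,
    # so the next launch is not blocked by the previous one
    procs.append(subprocess.Popen([sys.executable, "main.py", str(id)]))

# wait for all copies to finish, like the implicit wait at script exit
for p in procs:
    p.wait()
```

Using sys.executable instead of a hard-coded "python" makes sure the children run under the same interpreter as start.py.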