How to wait for thread execution to complete before starting new thread?

Time:09-30

I have Python code that can run at most 10 threads at a time because of GPU and compute limitations. I have 100 folders to process, and I want each thread to process one folder. Here is some sample code that I have written to achieve this.

import random
import threading
import time

def random_wait(thread_id):
    # print('Inside wait')
    rand_number = random.randint(3, 9)
    # print(f'Random number : {rand_number}')
    print(f'Thread {thread_id} waiting for {rand_number} seconds')
    time.sleep(rand_number)
    print(f'Thread {thread_id} completed execution')

if __name__=='__main__':
    total_runs = 6
    thread_limit = 3
    running_threads = list()
    for i in range(total_runs):
        print(f'Active threads : {threading.active_count()}')
        if threading.active_count() > thread_limit:
            print(f'Active thread count exceeded')
            # check if an existing thread is alive and for it to finish execution
            for running_thread in running_threads:
                if not running_thread.is_alive():
                    # Remove thread 
                    running_threads.remove(running_thread)
                    print(f'Removing thread: {running_thread}')
        else:
            thread = threading.Thread(target=random_wait, args=(i,), kwargs={})
            running_threads.append(thread)
            print(f'Starting thread : {i}')
            thread.start()

In this code, I am checking if the number of active threads exceeds the thread limit that I have specified, and the process refrains from creating new threads unless there's space for one more thread to be executed.

I am able to stop the process from starting new threads. However, I lose the threads that I wanted to start, and the code just ends up starting and stopping the first three threads. How can I start a new thread as soon as there's space for one more? Is there a better way in which I just start 10 threads and, as soon as one thread completes, assign it another folder to process?

CodePudding user response:

You should use a ThreadPoolExecutor from the Python standard library module concurrent.futures; it automatically manages a fixed number of threads. If you need to execute the same function with different arguments in parallel (as in a parallel for-loop), you can use the .map() method:

from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(10) as e:
    results = e.map(work, (arg_1, arg_2, ..., arg_n))
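Applied to the folder scenario from the question, a minimal sketch of the .map() approach might look like the following. The names process_folder and run_all are hypothetical, and the short sleep only stands in for the real per-folder work; the pool size of 3 mirrors thread_limit in the question.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def process_folder(folder_id):
    # Hypothetical stand-in for the real per-folder work.
    time.sleep(random.uniform(0.01, 0.05))
    return folder_id * 2


def run_all(folder_ids, thread_limit=3):
    # The pool never runs more than thread_limit calls at once; a worker
    # picks up the next folder as soon as it finishes the current one.
    with ThreadPoolExecutor(max_workers=thread_limit) as executor:
        # map() yields results in input order, not completion order.
        return list(executor.map(process_folder, folder_ids))


results = run_all(range(6))
print(results)
```

Because .map() returns results in the order of the inputs, each result can be matched to its folder by position, even though the folders may finish in a different order.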

If you need to schedule different work in parallel you should use the .submit() method:

from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(10) as e:
    future_1 = e.submit(work_1, arg_1)
    future_2 = e.submit(work_2, arg_2)

result_1 = future_1.result()
result_2 = future_2.result()

In the second case, .submit() returns a Future object which encapsulates the asynchronous execution of the work. You should store that future and call .result() on it when you need the value. Note that the context manager (with statement) ensures that the .shutdown() method is called before the block exits, so all submitted work has finished by that point.
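To handle each folder as soon as its thread finishes, rather than waiting for the whole batch, one option is to combine .submit() with as_completed() from the same module. The sketch below is only illustrative: process_folder and run_all are hypothetical names, the sleep replaces the real work, and the active/peak counters exist solely to show that the pool respects the limit.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

active = 0  # number of calls currently running (illustration only)
peak = 0    # highest value 'active' ever reached
lock = threading.Lock()


def process_folder(folder_id):
    # Hypothetical stand-in for the real per-folder work.
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.02)
    with lock:
        active -= 1
    return f"folder {folder_id} done"


def run_all(total_runs=6, thread_limit=3):
    finished = []
    with ThreadPoolExecutor(max_workers=thread_limit) as executor:
        # Submit everything up front; the pool queues the excess tasks.
        futures = {executor.submit(process_folder, i): i for i in range(total_runs)}
        # as_completed() yields each future as soon as it is done.
        for future in as_completed(futures):
            future.result()  # re-raises any exception from the worker
            finished.append(futures[future])
    return finished


finished = run_all()
print(f"Completed folders: {sorted(finished)}, peak concurrency: {peak}")
```

All tasks are submitted immediately, but only thread_limit of them run at any moment, so no work is lost the way it is in the manual active_count() loop.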
