Thread Pool with threads having DUPLICATE names

Time:02-17

I am learning Python multithreading and taking a stab at using concurrent.futures.ThreadPoolExecutor.

I created two functions that each use a ThreadPoolExecutor with the executor.map() method. The two functions are identical in that each will create 10 future objects (or threads). However, one function passes the max_workers=5 argument, while the other does not. Both functions execute another function named 'do_futures' that prints the name of the thread it runs on, obtained through threading.current_thread().

Here is my code:

import concurrent.futures
import threading
import time

# timer functions
def start_timer():
    return time.perf_counter()

def total_time(started):
    finished = time.perf_counter()
    return f'Total execution time is {round(finished-started,2)} second(s)'

# reports the name of the pool thread that runs each task
def do_futures(seconds):
    # current_thread().name replaces the deprecated currentThread().getName()
    print(f'{threading.current_thread().name} sleeping begin {seconds} second(s)...')
    time.sleep(seconds)
    return f'{threading.current_thread().name} done sleeping for {seconds} seconds.'


def threadpool_exec_map():

    start = start_timer()

    with concurrent.futures.ThreadPoolExecutor() as executor:
        secs = [5, 4, 3, 2, 1, 2, 3, 4, 1, 5]
        results = executor.map(do_futures, secs)

        for result in results:
            print(result)

    print(f'{total_time(start)}')

def threadpool_exec_map2(max_workers):
   
    start = start_timer()
    print(f'\n{max_workers = }\n')

    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        secs = [5, 4, 3, 2, 1, 2, 3, 4, 1, 5]
        results = executor.map(do_futures, secs)

        for result in results:
            print(result)

    print(f'{total_time(start)}')

# ------ main ----------- #
if __name__ == '__main__':

    threadpool_exec_map()               # concurrent.futures ThreadPoolExecutor() - NO max_workers passed
    threadpool_exec_map2(5)              # concurrent.futures ThreadPoolExecutor() - passing (max_workers=5) argument

Output for the threadpool_exec_map() is as follows (ALL showing UNIQUE names, no issue)

ThreadPoolExecutor-0_0 sleeping begin 5 second(s)...
ThreadPoolExecutor-0_1 sleeping begin 4 second(s)...
ThreadPoolExecutor-0_2 sleeping begin 3 second(s)...
ThreadPoolExecutor-0_3 sleeping begin 2 second(s)...
ThreadPoolExecutor-0_4 sleeping begin 1 second(s)...
ThreadPoolExecutor-0_5 sleeping begin 2 second(s)...
ThreadPoolExecutor-0_6 sleeping begin 3 second(s)...
ThreadPoolExecutor-0_7 sleeping begin 4 second(s)...
ThreadPoolExecutor-0_8 sleeping begin 1 second(s)...
ThreadPoolExecutor-0_9 sleeping begin 5 second(s)...
ThreadPoolExecutor-0_0 done sleeping for 5 seconds.
ThreadPoolExecutor-0_1 done sleeping for 4 seconds.
ThreadPoolExecutor-0_2 done sleeping for 3 seconds.
ThreadPoolExecutor-0_3 done sleeping for 2 seconds.
ThreadPoolExecutor-0_4 done sleeping for 1 seconds.
ThreadPoolExecutor-0_5 done sleeping for 2 seconds.
ThreadPoolExecutor-0_6 done sleeping for 3 seconds.
ThreadPoolExecutor-0_7 done sleeping for 4 seconds.
ThreadPoolExecutor-0_8 done sleeping for 1 seconds.
ThreadPoolExecutor-0_9 done sleeping for 5 seconds.
Total execution time is 5.0 second(s)

However, the threadpool_exec_map2(5) output below shows that three threads in the threadpool are assigned the same thread name ThreadPoolExecutor-1_4:

max_workers = 5

ThreadPoolExecutor-1_0 sleeping begin 5 second(s)...
ThreadPoolExecutor-1_1 sleeping begin 4 second(s)...
ThreadPoolExecutor-1_2 sleeping begin 3 second(s)...
ThreadPoolExecutor-1_3 sleeping begin 2 second(s)...
ThreadPoolExecutor-1_4 sleeping begin 1 second(s)...  < original
ThreadPoolExecutor-1_4 sleeping begin 2 second(s)...  < duplicate name
ThreadPoolExecutor-1_3 sleeping begin 3 second(s)...
ThreadPoolExecutor-1_2 sleeping begin 4 second(s)...
ThreadPoolExecutor-1_4 sleeping begin 1 second(s)...  < third thread with same name
ThreadPoolExecutor-1_1 sleeping begin 5 second(s)...
ThreadPoolExecutor-1_0 done sleeping for 5 seconds.
ThreadPoolExecutor-1_1 done sleeping for 4 seconds.
ThreadPoolExecutor-1_2 done sleeping for 3 seconds.
ThreadPoolExecutor-1_3 done sleeping for 2 seconds.
ThreadPoolExecutor-1_4 done sleeping for 1 seconds.
ThreadPoolExecutor-1_4 done sleeping for 2 seconds.
ThreadPoolExecutor-1_3 done sleeping for 3 seconds.
ThreadPoolExecutor-1_2 done sleeping for 4 seconds.
ThreadPoolExecutor-1_4 done sleeping for 1 seconds.
ThreadPoolExecutor-1_1 done sleeping for 5 seconds.
Total execution time is 9.01 second(s)

When the do_futures is called by the first function threadpool_exec_map() - which does NOT pass any argument - it assigns unique names to each thread in the threadpool as expected.

But when do_futures is called by the threadpool_exec_map2(5) function, three threads end up with the same name.

I believe passing the max_workers=5 argument to the threadpool_exec_map2() function is the culprit, but I couldn't explain it from the output. How could three threads with the same name "start sleeping concurrently"? Shouldn't the first thread with that name be "completed" first before the same name is assigned to the next thread, and so on?

Input appreciated.

CodePudding user response:

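What you are missing is that a thread pool reuses its worker threads. time.sleep blocks only the thread that calls it, and the other workers keep running. As soon as a task returns, its worker goes back to the pool and picks up the next queued task, still under the same thread name.

With a thread pool of size 5, it will kick off 5 threads, and the first 5 jobs (sometimes called tasks) start right away. The other 5 jobs wait in a queue. Whichever worker finishes first takes the next job, so the worker that slept for 1 second is the first to be reused, and it can be used a 3rd time once that job finishes too. When two workers become free at about the same moment, it is essentially arbitrary which of them gets the next job.

If you want to see it work differently, try doing some work instead of using sleep, such as calculating fibonacci(30). In standard CPython the timing will change because CPU-bound threads take turns holding the GIL, but a pool of 5 will still show at most 5 names.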
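One way to check that a repeated name is a single worker being reused, and not several threads sharing a name, is to record the thread ident alongside the name. The following is a minimal sketch, not part of the original code: report and names_to_idents are hypothetical helpers, and the sleep durations are scaled down from the question so the demo finishes quickly.

```python
import concurrent.futures
import threading
import time


def report(seconds):
    # Sleep, then return the name and ident of the worker that ran this task.
    time.sleep(seconds)
    thread = threading.current_thread()
    return thread.name, thread.ident


def names_to_idents(max_workers, durations):
    # Map each worker name to the set of thread idents seen under that name.
    seen = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        for name, ident in executor.map(report, durations):
            seen.setdefault(name, set()).add(ident)
    return seen


if __name__ == '__main__':
    durations = [0.5, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.1, 0.5]
    for name, idents in sorted(names_to_idents(5, durations).items()):
        print(name, idents)
```

Each name should be paired with exactly one ident, and there should be no more than 5 names for 10 tasks, which is what you would expect if tasks are handed to existing workers.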

CodePudding user response:

I think I may have found a good explanation to my question.

Per the Python thread pool documentation, the pool does reuse idle worker threads, as Garr Godfrey indicated in his answer.

Looking at the thread names - there are only 5 unique worker thread names and this is because of max_workers=5 argument.
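For comparison, the first function presumably showed 10 unique names because the default pool size on that machine was at least 10, so no task had to wait for a free worker. The sketch below computes the default documented for Python 3.8 through 3.12; default_pool_size is a hypothetical helper, and newer releases base the figure on os.process_cpu_count() instead, so treat the result as an approximation.

```python
import os


def default_pool_size():
    # Documented default for Python 3.8 to 3.12: min(32, os.cpu_count() + 4).
    # os.cpu_count() can return None, so fall back to 1 in that case.
    return min(32, (os.cpu_count() or 1) + 4)


if __name__ == '__main__':
    print(f'default max_workers would be {default_pool_size()}')
```

On a machine with 6 or more CPUs this gives 10 or more workers, enough for all 10 tasks to run at once.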

To explain the duplicate names: the first task on ThreadPoolExecutor-1_4 sleeps for only one second, so it finishes quickly and the worker thread is reused for the next task, which explains why that worker thread's name appears twice more.

Also notice that the ThreadPoolExecutor-1_1 (sleeps 4 seconds), ThreadPoolExecutor-1_2 (sleeps 3 seconds) and ThreadPoolExecutor-1_3 (sleeps 2 seconds) worker threads are also reused by the thread pool (though only once each).
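The same reasoning accounts for the two total execution times. Here is a rough simulation, assuming each queued task goes to whichever worker becomes free first and ignoring scheduling overhead; simulated_total is a hypothetical helper, not something from concurrent.futures.

```python
import heapq


def simulated_total(durations, workers):
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0] * workers
    heapq.heapify(free_at)
    finish = 0
    for seconds in durations:
        # The next queued task goes to the worker that frees up earliest.
        start = heapq.heappop(free_at)
        end = start + seconds
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


if __name__ == '__main__':
    secs = [5, 4, 3, 2, 1, 2, 3, 4, 1, 5]
    print(simulated_total(secs, 5))   # 9
    print(simulated_total(secs, 10))  # 5
```

With 5 workers the last 5-second task cannot start until the 4-second mark, giving 9 seconds in total, which matches the measured 9.01. With 10 workers every task starts immediately, so the total is the longest sleep, 5 seconds.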
