Home > Net >  Is ThreadPoolExecutor guaranteed to spread N tasks out evenly on N threads?
Is ThreadPoolExecutor guaranteed to spread N tasks out evenly on N threads?

Time:10-06

If I have to do a task 5 times and it would normally take 10-30 seconds per task, will this guarantee that I start all 5 on separate threads right away?

with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(task, [0, 1, 2, 3, 4]))

If the answer is no, then how would I get the guarantee I want?

CodePudding user response:

Nothing in life is guaranteed, but it's a fairly safe bet that each task will run in distinct threads within the thread pool.

The five submitted tasks go into an input queue of tasks to be run. If the worker function, task, executed very quickly, it would theoretically be possible for one of the threads in the thread pool to pull off from the input task queue all of the tasks and run them one after another before any of the other 4 threads had a chance to run. But given that we have a relatively long-running task, all 5 threads will be given a time slice before any one thread can complete running even one task.

Here is an example of of an extremely I/O-bound task:

from concurrent.futures import ThreadPoolExecutor

def task(x):
    import threading
    import time

    time.sleep(2)
    print(threading.get_ident(), x)

with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(task, [0, 1, 2, 3, 4]))

Prints:

15492 0
21120 1
3160 2
12548 3
13468 4

And here is an example of an extremely CPU-bound task:

from concurrent.futures import ThreadPoolExecutor

ONE_SECOND_ITERATIONS = 20_000_000

def one_second():
    sum = 0
    for _ in range(ONE_SECOND_ITERATIONS):
        sum  = 1
    return sum

def task(x):
    import threading

    for _ in range(2):
        one_second()
    print(threading.get_ident(), x)

with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(task, [0, 1, 2, 3, 4]))

Prints:

14660 0
21776 4
3384 3
16596 1
4924 2

Here is the case where task executes extremely quickly and one thread runs all 5 tasks:

from concurrent.futures import ThreadPoolExecutor

def task(x):
    import threading

    return (threading.get_ident(), x)

with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(task, [0, 1, 2, 3, 4]))
    print(results)

Prints:

[(18104, 0), (18104, 1), (18104, 2), (18104, 3), (18104, 4)]
  • Related