Multithreaded Python Program faster than Single Threaded program for CPU bound task

Time:08-21

EDIT: It turns out this odd behavior occurred only with Python in my WSL Ubuntu installation. Elsewhere, the sequential version does run faster than the multithreaded one.

I understand that in CPython, because of the Global Interpreter Lock (GIL), multiple threads generally just context-switch while sharing a single CPU core. They do not use multiple cores the way multiprocessing does, where several instances of the Python interpreter are started.
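As a minimal sketch of that first point (not one of the three benchmarks below; the helper names record_ids and run_threads are made up for illustration), every thread started with threading.Thread reports the same process ID, i.e. they all live inside one interpreter process:

```python
import os
import threading


def record_ids(results):
    # Each worker appends (process id, thread id).
    # list.append is safe to call from several threads in CPython.
    results.append((os.getpid(), threading.get_ident()))


def run_threads(count=4):
    # Start `count` threads, wait for all of them, and return what they recorded.
    results = []
    threads = [
        threading.Thread(target=record_ids, args=(results,))
        for _ in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    ids = run_threads()
    print("distinct process ids:", len({pid for pid, _ in ids}))
```

This only shows that the threads share one process; it does not by itself measure how the interpreter schedules them.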

I know this makes multithreading a good fit for I/O-bound tasks if done right. CPU-bound tasks, however, should actually be slower with multithreading. So I experimented with 3 code snippets, each doing some CPU-bound calculations.

  • Example 1 : Runs tasks in sequence (single thread)
  • Example 2 : Runs each task in a different thread (multithreaded)
  • Example 3 : Runs each task in a separate process (multiprocessing)

To my surprise, even though the task is CPU bound, Example 2 with multiple threads executes faster (1.5 secs on average) than Example 1 with a single thread (2.2 secs on average). Example 3 runs the fastest, as expected (1 sec on average).

I don't know what I am doing wrong.

Example 1 : Run tasks Sequentially

import time 
import math

nums = [ 8, 7, 8, 5, 8]

def some_computation(n):
    counter = 0
    for i in range(int(math.pow(n,n))):
        counter += 1

if __name__ == '__main__':
    start = time.time()
    for i in nums:
        some_computation(i)
    end = time.time()
    print("Total time of program execution : ", round(end-start, 4) )

Example 2 : Run tasks with Multithreading

import threading
import time 
import math

nums = [ 8, 7, 8, 5, 8]
def some_computation(n):
    counter = 0
    for i in range(int(math.pow(n,n))):
        counter += 1
    
if __name__ == '__main__':
    start = time.time()
    threads = []
    for i in nums: 
        x = threading.Thread(target=some_computation, args=(i,))
        threads.append(x)
        x.start()
    for t in threads:
        t.join()
    end = time.time()
    print("Total time of program execution : ", round(end-start, 4) )

Example 3 : Run tasks in parallel with multiprocessing module

from multiprocessing import Pool
import time
import math

nums = [ 8, 7, 8, 5, 8]
def some_computation(n):
    counter = 0
    for i in range(int(math.pow(n,n))):
        counter += 1

if __name__ == '__main__':
    start = time.time()
    pool = Pool(processes=3)
    for i in nums:
        pool.apply_async(some_computation, [i])
    pool.close()
    pool.join()
    end = time.time()
    print("Total time of program execution : ", round(end-start, 4) )

CodePudding user response:

As stated in my comment, it's a question of what the function is actually doing.

If we make the nums list longer (i.e., there will be more concurrent threads/processes) and also adjust the way the loop range is calculated, then we see this:

import time 
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

nums = [8,7,8,5,8,8,5,4,8,7,7,8,8,7,8,8,8]

def some_computation(n):
    counter = 0
    for _ in range(n*1_000_000):
        counter += 1
    return counter

def sequential():
    for n in nums:
        some_computation(n)

def threaded():
    with ThreadPoolExecutor() as executor:
        executor.map(some_computation, nums)

def pooled():
    with ProcessPoolExecutor() as executor:
        executor.map(some_computation, nums)

if __name__ == '__main__':
    for func in sequential, threaded, pooled:
        start = time.perf_counter()
        func()
        end = time.perf_counter()
        print(func.__name__, f'{end-start:.4f}')

Output:

sequential 4.8998
threaded 5.1257
pooled 0.7760

This indicates that the complexity of some_computation() determines how the system is going to behave. With this code and its adjusted parameters, threading is slower than running sequentially (as one would typically expect) and, of course, multiprocessing is significantly faster.
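Single runs can be noisy, so a small helper that repeats a measurement and summarizes it can make such comparisons more trustworthy. The following is only a sketch: the names time_call, summarize and small_task are hypothetical and not part of the code above, and small_task is just a stand-in workload.

```python
import statistics
import time


def time_call(func, repeats=5):
    # Call func() `repeats` times and return each wall-clock duration in seconds.
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    # Reduce a list of durations to a few robust summary numbers.
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def small_task():
    # Hypothetical stand-in for a CPU-bound function such as some_computation().
    counter = 0
    for _ in range(100_000):
        counter += 1
    return counter


if __name__ == "__main__":
    stats = summarize(time_call(small_task))
    print({name: f"{value:.4f}" for name, value in stats.items()})
```

Passing sequential, threaded or pooled to time_call instead of small_task would give a spread for each strategy rather than a single number.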

CodePudding user response:

It turns out this was happening only in the Ubuntu that I had installed in Windows Subsystem for Linux (WSL). My original snippets run as expected (sequential execution faster than multithreaded) in a native Windows or Ubuntu Python environment, but not in WSL. Thanks @Vlad for double-checking things on your end.
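For anyone wanting to check whether they are in the same situation, here is a rough sketch that reports the interpreter and platform and guesses whether it is running under WSL. Looking for "microsoft" in the kernel release string is a common heuristic and an assumption on my part, not an official API; looks_like_wsl and environment_report are made-up names.

```python
import os
import platform
import sys


def looks_like_wsl(release=None):
    # Heuristic (assumption): WSL kernels usually include "microsoft"
    # in the kernel release string.
    if release is None:
        release = platform.uname().release
    return "microsoft" in release.lower()


def environment_report():
    # Collect the details that are useful when comparing timings across machines.
    return {
        "python": sys.version.split()[0],
        "implementation": platform.python_implementation(),
        "system": platform.system(),
        "release": platform.uname().release,
        "cpu_count": os.cpu_count(),
        "looks_like_wsl": looks_like_wsl(),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Posting this output next to the timings makes it easier for others to tell whether a surprising result comes from the code or from the environment.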
