As far as I understand:
- Multithreading is an ideal option for I/O-bound applications.
Therefore, I tested a "for loop" without any I/O (code below).
However, it reduces the execution time from 6.3 s to 3.7 s.
Is this result correct, or is there a mistake in my assumption?
from multiprocessing.dummy import Pool as ThreadPool
import time

# normal
l = []
s = time.time()
for i in range(0, 10000):
    for j in range(i):
        l.append(j * 10)
e = time.time()
print(f"case1: {e-s}")  # 6.3 sec

# multiThread
def func(x):
    for i in range(x):
        l_.append(i * 10)

with ThreadPool(50) as pool:
    l_ = []
    s = time.time()
    pool.map(func, range(0, 10000))
    e = time.time()
print(f"case2: {e-s}")  # 3.7 sec
CodePudding user response:
In general it is true that multithreading is better suited to I/O-bound operations. However, in this trivial case it is clearly not so.
It's also worth pointing out that multiprocessing will outperform either of the strategies implemented in the OP's code.
Here's a rewrite that demonstrates 3 techniques:
from concurrent.futures import ProcessPoolExecutor as PPE, ThreadPoolExecutor as TPE
from functools import wraps
from time import perf_counter

def timeit(func):
    @wraps(func)
    def _wrapper(*args, **kwargs):
        start = perf_counter()
        result = func(*args, **kwargs)
        duration = perf_counter() - start
        print(f'Function {func.__name__}{args} {kwargs} Took {duration:.4f} seconds')
        return result
    return _wrapper

@timeit
def case1():
    l = []
    for i in range(0, 10000):
        for j in range(i):
            l.append(j * 10)

def process(n):
    l = []
    for j in range(n):
        l.append(j * 10)

@timeit
def case2():
    with TPE() as tpe:
        tpe.map(process, range(0, 10_000))

@timeit
def case3():
    with PPE() as ppe:
        ppe.map(process, range(0, 10_000))

if __name__ == '__main__':
    for func in case1, case2, case3:
        func()
Output:
Function case1() {} Took 3.3104 seconds
Function case2() {} Took 2.6354 seconds
Function case3() {} Took 1.7245 seconds
In this case the trivial per-task processing is probably outweighed by the overhead of thread management. If case1 were even more CPU-intensive, you'd probably begin to see rather different results.
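To make that concrete, here is a minimal sketch with a heavier per-task workload; the `heavy` function and the task sizes are illustrative assumptions, not taken from the original posts:

```python
from concurrent.futures import ProcessPoolExecutor
import time

def heavy(n):
    # a more CPU-intensive task than appending to a list:
    # sum of squares below n
    return sum(j * j for j in range(n))

if __name__ == '__main__':
    args = [50_000] * 40  # 40 moderately heavy, independent tasks

    start = time.perf_counter()
    serial = [heavy(n) for n in args]
    t_serial = time.perf_counter() - start

    start = time.perf_counter()
    with ProcessPoolExecutor() as ppe:
        # chunksize batches tasks to cut down inter-process overhead
        parallel = list(ppe.map(heavy, args, chunksize=10))
    t_parallel = time.perf_counter() - start

    # both approaches must agree on the results
    assert serial == parallel
    print(f"serial: {t_serial:.2f}s  processes: {t_parallel:.2f}s")
```

With enough work per task, the process pool sidesteps the GIL and can beat the serial loop, though process startup and pickling overhead still have to be amortized.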
CodePudding user response:
Multithreading is ideal for I/O applications because it allows a server/host to accept multiple connections; if a single request is slow or hangs, the server can continue serving the other connections without blocking them.
That isn't mutually exclusive with speeding up a simple for loop, provided the iterations require no coordination between threads, as in your trivial example above. If each iteration of the loop is completely independent of the others, the work is also well suited to multithreading, and that's why you're seeing a speed-up.
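To illustrate the I/O-bound case described above, here is a minimal sketch that uses `time.sleep` as a stand-in for a slow network or disk call; the `fake_request` helper is a hypothetical example, not part of the question:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fake_request(i):
    # stand-in for a slow I/O call; sleep releases the GIL,
    # so other threads can run while this one waits
    time.sleep(0.1)
    return i

start = time.perf_counter()
seq_results = [fake_request(i) for i in range(8)]  # one call after another
t_seq = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as tpe:
    thr_results = list(tpe.map(fake_request, range(8)))  # all waits overlap
t_thr = time.perf_counter() - start

print(f"sequential: {t_seq:.2f}s  threaded: {t_thr:.2f}s")
```

The sequential version pays the full latency of every call (about 0.8 s here), while the threaded version overlaps the waits and finishes in roughly the time of the single slowest call.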