Understanding Python multiprocessing Pool.map thread safety


This question has had conflicting answers: are Python multiprocessing Pools thread safe?

I am new to concurrency patterns, and I am trying to run a project that takes in a large array and distributes the work on that array across multiple processes.

from multiprocessing import Pool

inputs = range(100000)
with Pool(2) as pool:
    res = pool.map(some_func, inputs)

My understanding is that the pool will distribute the tasks to the processes. My questions are:

  1. Is this map operation thread safe? Will two processes ever accidentally try to process the same value?
  2. I superficially understand that the tasks will be divided up into chunks and sent to processes. However, if different inputs take more time than others, will the work always be evenly distributed across my processes? Will I ever be in a scenario where one process is hanging with a long queue of tasks to do while other processes are idle?
  3. My understanding is that since I am just reading inputs in, I don't need to use any interprocess communication patterns like a server process manager / shared memory. Is that right?
  4. If I set up more processes than cores, will it basically operate like threads where the CPU is switching between tasks?

Thank you!

CodePudding user response:

  1. With the code provided, it is impossible for the same item of inputs to be processed by more than one process (the exception would be if the same instance of an object appears more than once in the iterable passed as the argument). Nevertheless, this way of using multiprocessing has a lot of overhead, since the items of inputs are sent to the processes one by one. A better approach is to use the chunksize parameter:
from multiprocessing import Pool

inputs = range(100000)
n_proc = 2
# one chunk per process: ceiling of len(inputs) / n_proc
chunksize = len(inputs) // n_proc
if len(inputs) % n_proc:
    chunksize += 1
with Pool(n_proc) as pool:
    res = pool.map(some_func, inputs, chunksize=chunksize)

This way, whole chunks of inputs are passed to each process at once, leading to better performance.

  2. The work is not divided into chunks unless you ask for it. If no chunksize is provided, each chunk is a single item from the iterable (the equivalent of chunksize=1). Chunks are 'sent' one by one to the available processes in the pool: each process receives its next chunk as soon as it finishes the previous one and becomes available, so there is no guarantee that every process handles the same number of chunks. In your example, if some_func takes longer for larger values and chunksize = len(inputs) // 2, the process that gets the chunk with the first half of inputs (the smaller values) will finish first while the other takes much longer. In that case, a smaller chunksize is the better option, so the work stays evenly distributed (see the sketch below).
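As a rough illustration (some_func here is a hypothetical stand-in whose cost grows with its input), a modest chunksize lets whichever worker is free grab the next small batch, so the slow items do not all pile up on one process:

import time
from multiprocessing import Pool

def some_func(n):
    # hypothetical workload: larger inputs take longer
    time.sleep(n / 1_000_000)
    return n * n

if __name__ == "__main__":
    inputs = range(2000)
    with Pool(2) as pool:
        # small chunks: workers keep pulling batches as they become free,
        # so the expensive (large-value) items are spread across both processes
        res = pool.map(some_func, inputs, chunksize=50)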

  3. This depends on what some_func does. If you do not need the result of some_func(n) in order to compute some_func(m), you do not need to communicate between processes (see the sketch below). If you are using map and do need to communicate between processes, it is very likely that you are taking the wrong approach to your problem.
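For example (again with a hypothetical, purely self-contained some_func), each call reads only its own argument, and pool.map collects the return values back in the parent process, so no Manager or shared memory is involved:

from multiprocessing import Pool

def some_func(n):
    # pure function: no shared state is read or written
    return n * n

if __name__ == "__main__":
    with Pool(2) as pool:
        res = pool.map(some_func, range(100000), chunksize=1000)
    # results come back to the parent, in input order
    print(res[:5])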

  4. If you create more processes than cores (e.g. a pool larger than os.cpu_count()), the CPU will switch between processes more often than with a smaller pool, much like switching between threads. Don't forget that a (not amazingly old) computer is already running many more processes than just your program. On Windows the number of workers must be less than or equal to 61; that cap is documented for concurrent.futures.ProcessPoolExecutor's max_workers (see the docs).
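As a small sketch of sizing the pool (some_func is again a placeholder): for CPU-bound work there is usually no benefit in going beyond os.cpu_count(), and on Windows you also want to stay within the documented 61-worker cap:

import os
from multiprocessing import Pool

def some_func(n):
    return n * n

if __name__ == "__main__":
    # cap at the core count (and at 61, the limit the docs give for Windows)
    n_proc = min(os.cpu_count() or 1, 61)
    with Pool(n_proc) as pool:
        res = pool.map(some_func, range(100000), chunksize=1000)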
