How many processors should be used with multiprocessing.Pool?

I am trying to use multiprocessing.Pool to run my code in parallel. To instantiate Pool, you have to set the number of processes, and I am trying to figure out what number to use. I understand it shouldn't be more than the number of cores you have, but I've seen different ways to determine what your system has available.

Two methods:

multiprocessing.cpu_count()
len(os.sched_getaffinity(0))

I'm a little confused: what is the difference between the two, and which should be used with Pool? I am working on a remote cluster, where the first reports 128 CPUs but the second gives 10.

CodePudding user response：

The difference between the two is stated in the documentation:

multiprocessing.cpu_count() Return the number of CPUs in the system.

This number is not equivalent to the number of CPUs the current process can use. The number of usable CPUs can be obtained with len(os.sched_getaffinity(0)).

So even if you are on a 128-core system, your program may have been restricted to run on a specific set of 10 out of the 128 CPUs. Since the affinity mask is inherited by child processes and threads, it doesn't make much sense to spawn more than 10 workers. You could, however, try to widen the set of allowed CPUs through os.sched_setaffinity() before starting your pool. On a managed cluster the limit is usually imposed by the job scheduler, so the call may fail or have no effect.
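To see the two numbers side by side, something like the following could be used. It is only a sketch: cpu_counts is a made-up helper name, and the fallback branch is an assumption for platforms where os.sched_getaffinity() is missing (it only exists on some Unix platforms such as Linux).

```python
import multiprocessing as mp
import os


def cpu_counts():
    """Return (CPUs in the system, CPUs this process may run on)."""
    total = mp.cpu_count()
    if hasattr(os, "sched_getaffinity"):
        # 0 means "the calling process".
        usable = len(os.sched_getaffinity(0))
    else:
        # Assumption: without affinity information, fall back to the total.
        usable = total
    return total, usable


if __name__ == "__main__":
    total, usable = cpu_counts()
    print(f"system CPUs: {total}, usable by this process: {usable}")
```

On the cluster from the question this would presumably print 128 and 10.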

import os
import multiprocessing as mp

cpu_count = mp.cpu_count()

# Try to widen the affinity mask to every CPU in the system.
if len(os.sched_getaffinity(0)) < cpu_count:
    try:
        os.sched_setaffinity(0, range(cpu_count))
    except OSError:
        print('Could not set affinity')

# One worker per usable CPU, capped at 96.
n = min(len(os.sched_getaffinity(0)), 96)
print('Using', n, 'processes for the pool')

pool = mp.Pool(n)
# ...

See also man 2 sched_setaffinity.
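
A somewhat more defensive variant of the snippet above, as a rough sketch: it leaves the affinity untouched, falls back to os.cpu_count() where os.sched_getaffinity() does not exist, and never starts more workers than there are tasks. The names usable_cpus, pool_size and work are made up for illustration.

```python
import multiprocessing as mp
import os


def usable_cpus():
    """Number of CPUs the current process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Assumption: no affinity API on this platform, so use the system total.
    return os.cpu_count() or 1


def pool_size(n_tasks):
    """Hypothetical helper: one worker per usable CPU, but no more than n_tasks."""
    return max(1, min(usable_cpus(), n_tasks))


def work(x):
    # Placeholder for the real CPU-bound function.
    return x * x


if __name__ == "__main__":
    tasks = list(range(20))
    with mp.Pool(processes=pool_size(len(tasks))) as pool:
        print(pool.map(work, tasks))
```

The `if __name__ == "__main__":` guard matters when the start method is spawn or forkserver, because the worker processes import the main module again.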

CodePudding user response：

First of all, keep in mind that cpu_count() returns the number of virtual (logical) CPUs, which can be larger than the number of physical cores when each core supports multiple hardware threads. To see the number of physical cores, use the third-party psutil package:

psutil.cpu_count(logical=False)

Anyway, with psutil.cpu_count() you get the number of logical CPUs, which is also the maximum number of threads that can execute at the same instant on your system.

With

os.sched_getaffinity(0)

(where 0 means the current process; the pid argument is required) you get the set of CPUs the current process is allowed to run on, and its length is the number of usable CPUs. You can change that set with:

os.sched_setaffinity(0, [1, 2, 3])

Here, for instance, you tell the process to use 3 CPUs, namely CPUs 1, 2, and 3.
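
Hard-coded CPU numbers may not be in the set your process is allowed to use (on a cluster, for example), and the call can then raise OSError. A possibly safer sketch picks the CPUs from the current affinity set instead. pin_to_first is a hypothetical helper name, and this only works where os.sched_getaffinity() and os.sched_setaffinity() exist (Linux, for instance).

```python
import os


def pin_to_first(n):
    """Hypothetical helper: restrict this process to n of its allowed CPUs."""
    if n < 1:
        raise ValueError("n must be at least 1")
    allowed = sorted(os.sched_getaffinity(0))
    chosen = set(allowed[:n])  # fewer than n if fewer are allowed
    os.sched_setaffinity(0, chosen)
    return chosen


if __name__ == "__main__":
    before = os.sched_getaffinity(0)
    print("before:", sorted(before))
    print("pinned:", sorted(pin_to_first(2)))
    os.sched_setaffinity(0, before)  # restore the original set
```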

Note that even if you set Pool to use the maximum available number of CPUs, you won't get the maximum parallelism anyway, since some CPU time will always go to the operating system and other processes. Similarly, in a multi-user environment you are unlikely to achieve the parallelism implied by the number of processes in the pool.

Scheduling engines like SLURM or YARN can guarantee that a process gets a certain number of CPUs and therefore the desired parallelism.
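
On SLURM, for example, the allocation is visible to the job through environment variables. The sketch below assumes the job was submitted with --cpus-per-task, which is when SLURM sets SLURM_CPUS_PER_TASK; otherwise it falls back to the affinity set. cpus_from_scheduler is a made-up name, and the env parameter exists only to make the function easy to try out.

```python
import os


def cpus_from_scheduler(env=None):
    """Hypothetical helper: CPUs granted by SLURM, else CPUs usable by this process."""
    env = os.environ if env is None else env
    value = env.get("SLURM_CPUS_PER_TASK")
    if value is not None and value.isdigit() and int(value) > 0:
        return int(value)
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Assumption: no scheduler variable and no affinity API, use the system total.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("workers to use:", cpus_from_scheduler())
```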