I am reading up on processors and was under the impression that the number of processes that can be spun up is tied to the number of cores in your computer, and that the CPU used to run tasks is allocated to each process (which can then spin up one or many threads that share this processing power to achieve concurrency). You could also achieve parallelism by running multiple tasks in multiple processes: since each has its own CPU, they can run in parallel.
This was my assumption, but then I looked at the Task Manager: [screenshot: Task Manager CPU]
In this case, over 200 processes exist!
What does this mean if we go back to my previous understanding of processes? Could we still achieve parallelism here? And what is the overhead of spawning this many processes versus what we gain from it?
Answer:
The number of processes running on a machine can be larger than the number of cores or the number of processors. There is no direct link between the two.
Processes can have one or multiple software threads, which the operating system (OS) schedules on processing units. These processing units are called hardware threads. One hardware thread can only execute one software thread at a time. The OS uses preemption to execute multiple software threads on the same hardware thread. More specifically, each software thread is scheduled for a given time slice (called a quantum, typically on the order of milliseconds to tens of milliseconds on Windows). The OS performs a context switch when the quantum is over, when the software thread performs a blocking operation (e.g. waiting for another software thread, reading a file, etc.), or when a thread with a higher priority becomes ready (e.g. because the blocking operation it was waiting on has completed, as mentioned by @MartinJames). The exact algorithm can change from one platform to another. For Windows, you can get some information about that in the Microsoft documentation.
On many mainstream desktop and server processors, each core has 2 hardware threads (on Intel processors, this is called Hyper-Threading). This means each core can execute two software threads truly in parallel (though they share some hardware resources). The number of software threads that can truly run in parallel is defined by the number of hardware threads. The Task Manager reports this as "logical processors". 16 threads can run in parallel on your machine, but more can run concurrently thanks to preemption.
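The "logical processors" figure can also be queried from a program. Here is a minimal Python sketch; the helper name `logical_processor_count` is made up for illustration, and the fallback to 1 is an assumption for platforms where the count cannot be determined.

```python
import os


def logical_processor_count() -> int:
    """Return the number of logical processors (hardware threads) reported by the OS."""
    # os.cpu_count() returns None when the count cannot be determined,
    # so fall back to 1 in that case (an assumption made for this sketch).
    return os.cpu_count() or 1


print(f"Logical processors: {logical_processor_count()}")
```

On a machine like the one in the question, this should print the same number Task Manager shows as "logical processors", not the number of physical cores.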
Note that most software threads are waiting most of the time, so the OS does not schedule them on hardware threads. The OS wakes them up only when needed. Consequently, having 200 processes is not an issue as long as they (or, more precisely, their software threads) are not actively running.
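To see that waiting threads cost almost no CPU, here is a rough Python sketch. It starts many threads that block on an event and measures how much CPU time the whole process consumes while they wait. The function name `cpu_time_with_idle_threads` and the default values are hypothetical, and the exact numbers will vary by platform.

```python
import threading
import time


def cpu_time_with_idle_threads(n_threads: int = 200, wait_seconds: float = 0.5) -> float:
    """Start n_threads blocked threads and return the CPU seconds used while they wait."""
    wake_up = threading.Event()
    # Each thread blocks in wake_up.wait() until the event is set.
    threads = [threading.Thread(target=wake_up.wait) for _ in range(n_threads)]
    for t in threads:
        t.start()

    # process_time() counts CPU time of the whole process (all its threads),
    # and does not include time spent sleeping or blocked.
    cpu_before = time.process_time()
    time.sleep(wait_seconds)  # the main thread is blocked here too
    cpu_used = time.process_time() - cpu_before

    wake_up.set()  # release every waiting thread so it can exit
    for t in threads:
        t.join()
    return cpu_used


print(f"CPU seconds used while 200 threads waited: {cpu_time_with_idle_threads():.4f}")
```

The reported CPU time should be a small fraction of the wall-clock wait, because blocked threads are simply not scheduled.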
Note that processes are not bound to a given processor. One process can have several threads executed on multiple processors.
Spawning a process is quite expensive compared to spawning a software thread. It generally takes from 100 µs to 10 ms (depending on the OS, the target machine and the program to run). Spawning a software thread is much faster: it generally does not take more than 1 ms (but again, this is very dependent on the platform). To use all the available cores/hardware threads, a program needs to spawn multiple software threads.
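If you want to get a feel for these costs on your own machine, here is a rough Python sketch. The helper names `time_thread_spawn` and `time_process_spawn` are made up for illustration. Note the assumptions: each measurement covers creation, execution of an empty task and the wait for completion, and the process case also includes the startup of a fresh Python interpreter, so it overestimates the raw cost of creating a process.

```python
import subprocess
import sys
import threading
import time


def time_thread_spawn() -> float:
    """Seconds needed to start a thread that does nothing and wait for it to finish."""
    start = time.perf_counter()
    t = threading.Thread(target=lambda: None)
    t.start()
    t.join()
    return time.perf_counter() - start


def time_process_spawn() -> float:
    """Seconds needed to start a child Python process that does nothing and wait for it."""
    start = time.perf_counter()
    # sys.executable is the path of the running interpreter; "-c pass" runs an empty program.
    subprocess.run([sys.executable, "-c", "pass"], check=True)
    return time.perf_counter() - start


print(f"thread:  {time_thread_spawn() * 1e3:.3f} ms")
print(f"process: {time_process_spawn() * 1e3:.3f} ms")
```

A single run is noisy, so it is better to repeat each measurement several times and look at the median rather than trusting one number.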