Number of Threads higher than the Number of Cores

#include <cstdio>
#include <thread>
#include <unistd.h>
using namespace std;

void taskA() {
    for(int i = 0; i < 10; i++) {
        sleep(1);
        printf("TaskA: %d\n", i*i);
        fflush(stdout);
    }
}
void taskB() {
    for(int i = 0; i < 10; i++) {
        sleep(1);
        printf("TaskB: %d\n", i*i);
        fflush(stdout);
    }
}
void taskC() {
    for(int i = 0; i < 10; i++) {
        sleep(1);
        printf("TaskC: %d\n", i*i);
        fflush(stdout);
    }
}
void taskD() {
    for(int i = 0; i < 10; i++) {
        sleep(1);
        printf("TaskD: %d\n", i*i);
        fflush(stdout);
    }
}
void taskE() {
    for(int i = 0; i < 10; i++) {
        sleep(1);
        printf("TaskE: %d\n", i*i);
        fflush(stdout);
    }
}
void taskF() {
    for(int i = 0; i < 10; i++) {
        sleep(1);
        printf("TaskF: %d\n", i*i);
        fflush(stdout);
    }
}
void taskG() {
    for(int i = 0; i < 10; i++) {
        sleep(1);
        printf("TaskG: %d\n", i*i);
        fflush(stdout);
    }
}
void taskH() {
    for(int i = 0; i < 10; i++) {
        sleep(1);
        printf("TaskH: %d\n", i*i);
        fflush(stdout);
    }
}
int main(void) {
    thread t1(taskA);
    thread t2(taskB);
    thread t3(taskC);
    thread t4(taskD);
    thread t5(taskE);
    thread t6(taskF);
    thread t7(taskG);
    thread t8(taskH);
    t1.join();
    t2.join();
    t3.join();
    t4.join();
    t5.join();
    t6.join();
    t7.join();
    t8.join();
    return 0;
}

In this C++ program, I created 8 similar functions (taskA to taskH) and 8 threads, one for each. When I ran it, the output of all 8 functions appeared in parallel. But my laptop has only 4 cores.

So my question is: how is this happening? How can 4 cores run 8 threads in parallel? Please explain what is happening inside.

Thanks for your explanation!

CodePudding user response：

Each core can run 1 thread at a time, or 2 threads in parallel if hyperthreading is enabled. So, on a system with 4 cores installed, there can be 4 or 8 threads running in parallel, max.

However, your app is not the only one running threads. Every running process has at least 1 thread, maybe more. And the OS itself has dozens, maybe hundreds, of threads running. So there are clearly far more threads in total than there are cores installed.

So, the OS has a built-in scheduler that decides which thread runs on which core. When there are more runnable threads than cores, each core switches between threads at short intervals, known as "time slices". This switching is commonly known as "task switching" or "context switching".

This means that when a time slice on a core elapses, the core will temporarily pause the thread currently running on it, save that thread's state, and then resume an earlier paused thread for the next time slice, then pause and save that thread, switch to another thread, and so on, dozens or hundreds of times a second, spread out over however many cores are installed. A thread that blocks, for example by sleeping, gives up its core early so another thread can use it.

So when there are more runnable threads than cores, the impression that all of them run at the same time is partly an illusion. At any instant, at most one thread per core (or two with hyperthreading) is actually executing. The rest are waiting for their next time slice.

That is it, in a nutshell. Obviously, things are more complex in practical use. There are lengthy articles, research papers, even books, on this topic, if you really want to know the gritty implementation details.

CodePudding user response：

A bit about human physiology. Every person in the world has this thing called "reaction time". It is the time between an event actually occurring and your brain realizing it and reacting to it. It varies between people, but it is extremely rare for a human to have a reaction time below 100ms, and below 10ms is unheard of. This means that if two events happen one after another, but the time between them is less than 10ms, then a human will think that these events happened simultaneously.

What does it have to do with threading and cores? A CPU can only run N threads in parallel, where N is the number of cores (well, technically it is more complicated due to features like hyperthreading, but let's forget about it for now). So when you fire, say, 10*N threads, these cannot all run in parallel. It is technically impossible. What actually happens is that the operating system has this internal piece of code called the scheduler, which controls which thread runs on which core at a given moment. And it jumps from one thread to another, so that every thread gets some CPU time and can actually progress.

But you say "outputs of all the functions are coming at a same time". No, they aren't. The CPU processes billions of instructions per second, or equivalently, one or more instructions every billionth of a second. The exact number depends on what exactly the CPU is doing. Printing stuff to a monitor, for example, takes much more time, but it can still probably print thousands or tens of thousands of characters to the monitor in under 10ms. And since this is below your reaction time, you only think that it happened in parallel, while in reality it did not.

"and I am also using a sleep of 1sec"

Sleep is not an action. It is a lack of action, and as such it does not require the CPU. The operations of "falling asleep" and "waking up" require some CPU time (though very little), but the waiting itself does not. And indeed, waiting does happen truly in parallel, regardless of how many threads you have.