Simple C++ multithreading example is slower


I am currently learning basic C++ multithreading and wrote a very small program to explore the concepts. I keep hearing that multithreading is faster, so I tried the code below:

    #include <thread>

    // Func() does the small piece of work being timed; Timer is shown below.
    void Func();

    int main()
    {
        //---- SECTION 1: both calls on the main thread
        //     (run with Section 2 commented out)
        {
            Timer timer;
            Func();
            Func();
        }

        //---- SECTION 2: one call on a worker thread, one on the main thread
        //     (run with Section 1 commented out)
        {
            Timer timer;
            std::thread t(Func);
            Func();
            t.join();
        }
    }

And below is the Timer,

    #include <chrono>
    #include <iostream>

    struct Timer
    {
        std::chrono::time_point<std::chrono::high_resolution_clock> start, end;
        std::chrono::duration<float> duration;

        Timer()
        {
            start = std::chrono::high_resolution_clock::now();
        }

        ~Timer()
        {
            end = std::chrono::high_resolution_clock::now();
            duration = end - start;

            // duration.count() is in seconds; convert to milliseconds
            float ms = duration.count() * 1000.0f;

            std::cout << "Timer : " << ms << " milliseconds\n";
        }
    };

When I run Section 1 (with Section 2 commented out), I get times in the 0.1ms-0.2ms range, but when I run Section 2, I get 1ms and above. So Section 1 appears to be faster even though it runs everything on the main thread, while the multithreaded Section 2 is slower.

Your answers would be much appreciated. If I have misunderstood any of the concepts, corrections are welcome.

Thanks in advance.

CodePudding user response:

Multithreading can be faster, but it is not always faster. There are many things you can do in multithreaded code that actually slow things down!

This example shows one of them. In this case, your Func() is too short to benefit from such a simplistic use of multithreading. Standing up a new thread involves calls to the operating system to manage the new resources, and those calls are quite expensive compared with the 100-200us your Func takes. It also adds what are called "context switches," which are how the OS changes from one thread to another. If you used a much longer Func (say 20x or 50x longer), you would start to see the benefits.
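As a rough, hypothetical illustration (LongFunc and its loop count below are made up for the example, not your original Func), lengthening the workload makes the threaded version come out ahead:

    #include <chrono>
    #include <cmath>
    #include <iostream>
    #include <thread>

    // Hypothetical workload: long enough that thread start-up cost is negligible.
    void LongFunc()
    {
        // volatile keeps the compiler from optimizing the loop away
        volatile double acc = 0.0;
        for (int i = 0; i < 20000000; ++i)
            acc = acc + std::sqrt(static_cast<double>(i));
    }

    template <typename F>
    double TimeMs(F f)
    {
        auto start = std::chrono::high_resolution_clock::now();
        f();
        auto end = std::chrono::high_resolution_clock::now();
        return std::chrono::duration<double, std::milli>(end - start).count();
    }

    int main()
    {
        // Section 1 equivalent: both calls on the main thread.
        double serial = TimeMs([] { LongFunc(); LongFunc(); });

        // Section 2 equivalent: one call on a worker thread, one on the main thread.
        double threaded = TimeMs([] {
            std::thread t(LongFunc);
            LongFunc();
            t.join();
        });

        std::cout << "serial:   " << serial << " ms\n"
                  << "threaded: " << threaded << " ms\n";
    }

With a workload in the tens of milliseconds, the fixed cost of creating the thread stops dominating the measurement.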

How big of a deal are these context switches? Well, if you are CPU bound, doing computations as fast as you can, on every core of the CPU, most OSs like to switch threads every 4 milliseconds. That seems to be a decent tradeoff between responsiveness and minimizing overhead. If you aren't CPU bound (like when you finish your Func calls and have nothing else to do), it will obviously switch faster than that, but it's a useful number to keep in the back of your head when considering the time-scales threading is done at.

If you need to run a large number of things in a multi-threaded way, you are probably looking at a dispatch-queue sort of pattern. In this pattern, you stand up the "worker" thread once, and then use mutexes/condition-variables to shuffle work to the worker. This decreases the overhead substantially, especially if you can queue up enough work such that the worker can do several tasks before context switching over to the threads that are consuming the results.
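For what it's worth, here is a minimal single-worker sketch of that dispatch-queue pattern built from std::mutex, std::condition_variable, and std::queue; the class and method names are placeholders, not a standard API:

    #include <condition_variable>
    #include <functional>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <thread>

    // Minimal single-worker dispatch queue (illustrative only).
    class DispatchQueue
    {
    public:
        DispatchQueue() : worker_([this] { Run(); }) {}

        ~DispatchQueue()
        {
            {
                std::lock_guard<std::mutex> lock(m_);
                done_ = true;
            }
            cv_.notify_one();
            worker_.join();  // drains any remaining tasks, then stops
        }

        void Push(std::function<void()> task)
        {
            {
                std::lock_guard<std::mutex> lock(m_);
                tasks_.push(std::move(task));
            }
            cv_.notify_one();
        }

    private:
        void Run()
        {
            for (;;)
            {
                std::function<void()> task;
                {
                    std::unique_lock<std::mutex> lock(m_);
                    cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
                    if (done_ && tasks_.empty())
                        return;
                    task = std::move(tasks_.front());
                    tasks_.pop();
                }
                task();  // run outside the lock so producers are not blocked
            }
        }

        std::mutex m_;
        std::condition_variable cv_;
        std::queue<std::function<void()>> tasks_;
        bool done_ = false;
        std::thread worker_;  // declared last so the other members exist before it starts
    };

    int main()
    {
        DispatchQueue q;
        for (int i = 0; i < 4; ++i)
            q.Push([i] { std::cout << "task " << i << " done\n"; });
        // The destructor finishes the queued tasks and joins the worker.
    }

The key point is that the worker thread is created once; after that, pushing a task costs only a lock and a notify rather than a whole thread start-up.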

Another thing to watch out for when starting on multithreading is managing the granularity of the locking mechanisms you use. If you lock too coarsely (protecting large swaths of data with a single lock), you can't be concurrent. But if you lock too finely, you spend all of your time managing the synchronization tools rather than doing the actual computations. You don't get the benefits of multithreading for free. You have to look at your problem and find the right places to do it.
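As a small, made-up illustration of granularity (CoarseCounters, StripedCounters, and the stripe count of 16 are invented for this sketch), compare a single lock over a whole counter table with a striped scheme:

    #include <array>
    #include <cstddef>
    #include <mutex>
    #include <thread>
    #include <unordered_map>

    // Coarse-grained: one lock serializes every access to the whole table.
    struct CoarseCounters
    {
        std::mutex m;
        std::unordered_map<int, long> counts;

        void Increment(int key)
        {
            std::lock_guard<std::mutex> lock(m);
            ++counts[key];
        }
    };

    // Finer-grained: keys are striped across several locks, so threads touching
    // different stripes do not block each other.
    struct StripedCounters
    {
        static constexpr std::size_t kStripes = 16;
        std::array<std::mutex, kStripes> locks;
        std::array<std::unordered_map<int, long>, kStripes> counts;

        void Increment(int key)
        {
            std::size_t s = static_cast<std::size_t>(key) % kStripes;
            std::lock_guard<std::mutex> lock(locks[s]);
            ++counts[s][key];
        }
    };

    int main()
    {
        StripedCounters counters;
        std::thread a([&] { for (int i = 0; i < 1000; ++i) counters.Increment(i); });
        std::thread b([&] { for (int i = 0; i < 1000; ++i) counters.Increment(i + 500); });
        a.join();
        b.join();
    }

The coarse version is simpler and often good enough; the striped version lets threads working on different keys proceed concurrently, at the cost of more bookkeeping.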

CodePudding user response:

Your test code is timing the start-up of a thread (which involves a system call and is relatively expensive). Also, 0.1ms is too short an interval to measure accurately. Try to make your test code run for at least 5 seconds, or even longer if you want accurate results; that would also make the thread start-up time less significant.
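One rough way to see how much of your 1ms is thread start-up (the iteration count below is arbitrary) is to average the cost of creating and joining a thread that does nothing:

    #include <chrono>
    #include <iostream>
    #include <thread>

    int main()
    {
        constexpr int kIterations = 10000;  // more iterations give a steadier average

        auto start = std::chrono::high_resolution_clock::now();
        for (int i = 0; i < kIterations; ++i)
        {
            std::thread t([] {});  // thread that does no work at all
            t.join();
        }
        auto end = std::chrono::high_resolution_clock::now();

        double total_ms = std::chrono::duration<double, std::milli>(end - start).count();
        std::cout << "average create+join cost: " << total_ms / kIterations << " ms\n";
    }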

There are two main reasons to run threads: one is to perform work in parallel with other threads, thereby minimizing the time to compute; the other is to perform some I/O where the thread will wait for the kernel to respond. More modern approaches use asynchronous system calls so you don't need to wait.
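The C++ standard library doesn't wrap OS-level asynchronous I/O, but std::async / std::future give a similar "start it now, collect the result later" shape for ordinary work; a minimal sketch (Compute is just a stand-in):

    #include <future>
    #include <iostream>

    int Compute(int x)
    {
        return x * x;  // stand-in for a longer computation or an I/O request
    }

    int main()
    {
        // Launch the work; the calling thread is free to do other things.
        std::future<int> result = std::async(std::launch::async, Compute, 21);

        // ... other work here ...

        // Block only when the value is actually needed.
        std::cout << "result = " << result.get() << "\n";
    }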

You might want to use condition variables (google std::condition_variable) or a thread pool library. These will be much faster than spinning up a new thread each time.
