Running code in one thread is slower than running the code in main thread

I'm testing double-precision calculations in a thread and got a strange result. Running the calculations in the main thread takes almost half the time of running them in a separate thread and calling join in the main thread. With only a single thread, there shouldn't be a big difference from just calling the function directly. Am I doing something wrong?

The CPU is an Intel Xeon E-2136 limited to 4.1 GHz so that the boost frequency is the same regardless of how many cores are running.

#include <cstdio>
#include <cstdlib>  // rand
#include <stdexcept>
#include <thread>
#include <future>
#include <malloc.h>
#include <time.h>

#define TEST_ITERATIONS 1000*1000*1000

void *testNN(void *dummy) {
  volatile double x;  // volatile keeps the compiler from optimizing the loop away
  for (int i = 0; i < TEST_ITERATIONS; ++i) {
    x = rand();
    x *= rand();
  }
  return nullptr;
}

int main(){
    time_t start = time(nullptr);

    { // for future to join thread

      testNN(nullptr); // 12s

//      pthread_t thread_id;
//      pthread_create(&thread_id, NULL, testNN, nullptr);
//      pthread_join(thread_id, NULL); //27s

      std::future<void *> f[12];
//      f[0] = std::async(std::launch::async, testNN, nullptr);   // 27s
      // for multithreaded testing:
//    f[1] = std::async(std::launch::async, testNN, nullptr);
//    f[2] = std::async(std::launch::async, testNN, nullptr);
//    f[3] = std::async(std::launch::async, testNN, nullptr);
//    f[4] = std::async(std::launch::async, testNN, nullptr);
//    f[5] = std::async(std::launch::async, testNN, nullptr);
//    f[6] = std::async(std::launch::async, testNN, nullptr);
//    f[7] = std::async(std::launch::async, testNN, nullptr);
//    f[8] = std::async(std::launch::async, testNN, nullptr);
//    f[9] = std::async(std::launch::async, testNN, nullptr);
//    f[10] = std::async(std::launch::async, testNN, nullptr);
//    f[11] = std::async(std::launch::async, testNN, nullptr);

    }

    time_t runTime = time(nullptr);
    runTime -= start;

    printf("calc done in %lds (%ld calc/s)\n", runTime, TEST_ITERATIONS / runTime);

}
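
The time() calls above only resolve whole seconds, which is coarse for comparing runs. As a minimal sketch, assuming C++11 is acceptable, std::chrono::steady_clock could give sub-second timings for the same comparison. The helper name secondsTaken is hypothetical and not part of the original test:

```cpp
#include <chrono>

// Hypothetical helper (not from the original post): runs `work` once and
// returns the elapsed wall-clock time in seconds as a double.
template <typename Work>
double secondsTaken(Work work) {
  auto begin = std::chrono::steady_clock::now();  // monotonic clock, unaffected by clock adjustments
  work();
  auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(end - begin).count();
}
```

It could be used as `double s = secondsTaken([] { testNN(nullptr); });` around either the direct call or the std::async/get pair.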

I compile with

# g++ -std=c++11 test.cpp -o test -lpthread

and results for function call, pthread and std::async respectively:

# time ./test
calc done in 12s (83333333 calc/s)

real    0m12.073s
user    0m12.070s
sys     0m0.003s

# time ./test
calc done in 27s (37037037 calc/s)

real    0m26.741s
user    0m26.738s
sys     0m0.004s

# time ./test
calc done in 27s (37037037 calc/s)

real    0m26.788s
user    0m26.785s
sys     0m0.003s

P.S. I'm still not sure if I want to use C++11. I used C++11 just to test whether there would be a difference between plain pthread and std::async.

CodePudding user response：

Thanks to @AndreasWenzel I found out that rand() is causing the slowdown. In theory it shouldn't be a problem when only one thread is running (or at least when no other thread is calling rand). Replacing rand() with rand_r() fixes the problem and even brings the time down to 8s for the same amount of work. Here is the test function:

void *testNN(void *dummy) {
  volatile double x;
  unsigned int seed = (unsigned int) time(nullptr);


  for (long i = 0; i < TEST_ITERATIONS; ++i) {
    x = rand_r(&seed);
    x *= rand_r(&seed);
  }
  return nullptr;
}

I know this is not ideal: starting 12 threads will most likely seed them all with the same number, but this is just a test. I'll most likely use a more complex seed function.
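
One possible shape for such a seed function, as a rough sketch rather than a final design: draw a base value from std::random_device and add a per-thread offset, so that no two threads start rand_r() from the same seed. The names makeSeeds, worker and runWorkers are hypothetical, and the iteration count is a parameter here only to keep the example quick to run:

```cpp
#include <stdlib.h>  // rand_r (POSIX)
#include <future>
#include <random>
#include <vector>

// Hypothetical helper: one seed per thread. The base comes from
// std::random_device; the odd multiplier makes every offset distinct
// modulo 2^32, so the seeds are pairwise different.
std::vector<unsigned int> makeSeeds(unsigned int threadCount) {
  std::random_device rd;
  const unsigned int base = rd();
  std::vector<unsigned int> seeds;
  for (unsigned int i = 0; i < threadCount; ++i)
    seeds.push_back(base + i * 0x9E3779B9u);
  return seeds;
}

// Same loop as testNN, but the seed is passed in by the caller.
// Returns the final rand_r() state so the caller can inspect it.
unsigned int worker(unsigned int seed, long iterations) {
  volatile double x = 0;
  for (long i = 0; i < iterations; ++i) {
    x = rand_r(&seed);
    x = x * rand_r(&seed);
  }
  return seed;
}

// Starts one std::async task per seed and waits for all of them.
std::vector<unsigned int> runWorkers(unsigned int threadCount, long iterations) {
  std::vector<std::future<unsigned int>> futures;
  for (unsigned int s : makeSeeds(threadCount))
    futures.push_back(std::async(std::launch::async, worker, s, iterations));
  std::vector<unsigned int> finalStates;
  for (auto &f : futures) finalStates.push_back(f.get());
  return finalStates;
}
```

Because each thread owns its seed variable, no generator state is shared between threads, which is the same property that made rand_r() fast in the test above.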