I'm testing double-precision calculations in a thread and got a strange result. Running the calculations in the main thread takes a little under half the time of running them in a separate thread and calling join in the main thread (12s vs 27s). With only a single thread doing the work, there shouldn't be a big difference from just calling the function directly. Am I doing something wrong?
The CPU is an Intel Xeon E-2136, limited to 4.1GHz so that the boost frequency is the same regardless of how many cores are running.
#include <cstdio>
#include <cstdlib>
#include <stdexcept>
#include <thread>
#include <future>
#include <malloc.h>
#include <time.h>
#define TEST_ITERATIONS (1000*1000*1000)
void *testNN(void *dummy) {
    volatile double x; // volatile keeps the compiler from optimizing the loop away
    for (int i = 0; i < TEST_ITERATIONS; i++) {
        x = rand();
        x *= rand();
    }
    return nullptr;
}
int main() {
    time_t start = time(nullptr);
    { // for future to join thread
        testNN(nullptr); // 12s
        // pthread_t thread_id;
        // pthread_create(&thread_id, NULL, testNN, nullptr);
        // pthread_join(thread_id, NULL); //27s
        std::future<void *> f[12];
        // f[0] = std::async(std::launch::async, testNN, nullptr); // 27s
        // for multithreaded testing:
        // f[1] = std::async(std::launch::async, testNN, nullptr);
        // f[2] = std::async(std::launch::async, testNN, nullptr);
        // f[3] = std::async(std::launch::async, testNN, nullptr);
        // f[4] = std::async(std::launch::async, testNN, nullptr);
        // f[5] = std::async(std::launch::async, testNN, nullptr);
        // f[6] = std::async(std::launch::async, testNN, nullptr);
        // f[7] = std::async(std::launch::async, testNN, nullptr);
        // f[8] = std::async(std::launch::async, testNN, nullptr);
        // f[9] = std::async(std::launch::async, testNN, nullptr);
        // f[10] = std::async(std::launch::async, testNN, nullptr);
        // f[11] = std::async(std::launch::async, testNN, nullptr);
    }
    time_t runTime = time(nullptr);
    runTime -= start;
    printf("calc done in %lds (%ld calc/s)\n", runTime, TEST_ITERATIONS / runTime);
}
I compile with
# g++ -std=c++11 test.cpp -o test -lpthread
and the results for the plain function call, pthread and std::async, respectively, are:
# time ./test
calc done in 12s (83333333 calc/s)
real 0m12.073s
user 0m12.070s
sys 0m0.003s
# time ./test
calc done in 27s (37037037 calc/s)
real 0m26.741s
user 0m26.738s
sys 0m0.004s
# time ./test
calc done in 27s (37037037 calc/s)
real 0m26.788s
user 0m26.785s
sys 0m0.003s
P.S. I'm still not sure if I want to use C++11. I used C++11 just to test whether there is a difference between plain pthread and std::async.
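Since time() only has one-second resolution, a finer-grained comparison of the two cases might look like the sketch below. It uses std::chrono::steady_clock and takes the iteration count as a parameter so short runs are possible. The names workload, secondsTaken, timeDirect and timeAsync are made up for this illustration and are not part of the program above.

```cpp
#include <chrono>
#include <future>
#include <stdlib.h>

// Same loop as testNN, but with the iteration count as a parameter.
void workload(long iterations) {
    volatile double x;
    for (long i = 0; i < iterations; i++) {
        x = rand();
        x *= rand();
    }
}

// Hypothetical helper: run any callable and return the elapsed
// wall-clock time in seconds (fractional).
template <typename F>
double secondsTaken(F &&run) {
    auto start = std::chrono::steady_clock::now();
    run();
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Time the loop when called directly on the calling thread.
double timeDirect(long iterations) {
    return secondsTaken([iterations] { workload(iterations); });
}

// Time the loop when run on a separate thread via std::async,
// including the wait for it to finish.
double timeAsync(long iterations) {
    return secondsTaken([iterations] {
        std::async(std::launch::async, workload, iterations).get();
    });
}
```

Printing timeDirect(n) and timeAsync(n) for the same n from main gives both numbers in one run. I would call timeDirect first, in case starting a thread affects later calls in the same process; that is a guess worth checking, not something I have confirmed.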
Answer:
Thanks to @AndreasWenzel I found out that rand() is causing the slowdown. In theory it shouldn't be a problem when only one thread is running (or at least when no other thread is calling rand()). Replacing rand() with rand_r() fixes the problem and even brings the time down to 8s for the same amount of work. Here is the test function:
void *testNN(void *dummy) {
    volatile double x;
    // rand_r keeps its state in a caller-owned seed instead of shared global state
    unsigned int seed = (unsigned int) time(nullptr);
    for (long i = 0; i < TEST_ITERATIONS; i++) {
        x = rand_r(&seed);
        x *= rand_r(&seed);
    }
    return nullptr;
}
I know this is not ideal: starting 12 threads will most likely seed all of them with the same number, but that's just a test. I'll most likely use a more complex seed function.
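One possible shape for such a seed function is sketched below: each thread gets its own seed derived from a common base and the thread's index. The names threadSeed, testSeeded and runSeeded are made up for this illustration, and the multiplier is just a large odd constant chosen to spread the seeds apart. This is test-quality seeding, not a statistically rigorous scheme.

```cpp
#include <stdlib.h>   // rand_r (POSIX)
#include <future>
#include <vector>

// Hypothetical helper: derive a distinct seed per thread from one base seed.
// The multiplier is odd, so different indices give different seeds
// (unsigned arithmetic wraps around, which is well defined).
unsigned int threadSeed(unsigned int base, unsigned int index) {
    return base + (index + 1u) * 2654435761u;
}

// Same work as testNN, but with a caller-supplied seed and iteration count.
// Returns the last product so the result can be inspected.
double testSeeded(unsigned int seed, long iterations) {
    volatile double x = 0.0;
    for (long i = 0; i < iterations; i++) {
        x = rand_r(&seed);
        x *= rand_r(&seed);
    }
    return x;
}

// Launch `threads` workers, each with its own seed, and collect the results.
std::vector<double> runSeeded(unsigned int base, unsigned int threads,
                              long iterations) {
    std::vector<std::future<double>> futures;
    for (unsigned int t = 0; t < threads; t++) {
        futures.push_back(std::async(std::launch::async, testSeeded,
                                     threadSeed(base, t), iterations));
    }
    std::vector<double> results;
    for (auto &f : futures) {
        results.push_back(f.get()); // get() waits for the worker to finish
    }
    return results;
}
```

For the 12-thread test this would be called as runSeeded((unsigned int) time(nullptr), 12, TEST_ITERATIONS). Because each worker owns its seed, no state is shared between the threads.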