My thread pool triggers race conditions but I can't figure out why

Time:10-25

Below is my thread pool code. After three hours of debugging, I'm turning to you for help.
Q1: Is there something wrong with my code? I see race conditions when I run it.
Q2: The worker threads did not execute until I added a sleep() call to my main function. I want to understand that too.
PS: I ran this code on Ubuntu.

//this is thread_pool.h
#ifndef THREAD_POOL_H
#define THREAD_POOL_H

#include <list>
#include <vector>
#include <unistd.h>
// #include "locker.h"
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <semaphore.h>

class thread_pool {
private:
    struct task {
        void (*fun)(void*);
        void* arg;
    };
private:
    int size;       //the num of working threads
    pthread_mutex_t lock;       //a mutex
    sem_t sem;    //semaphore to indicate the num of jobs
    std::list<task> tasks;
    std::vector<pthread_t> threads; 

    const int default_size = 16;
    int is_shutdown;
public:
    thread_pool();
    thread_pool(int num);
    ~thread_pool();
    void add_job(void (*fun)(void*), void* arg);
private:
    static void* work(void* arg);   //the working threads' call back
    void run();                     //the actual function that work() calls
};

#endif
//this is thread_pool.cpp
#include "thread_pool.h"


thread_pool::thread_pool(int num) : size(num), is_shutdown(0) {
    if (num <= 0) {
        fprintf(stderr, "the num of working threads is incorrect\n");
        exit(1);
    }
    pthread_mutex_init(&lock, NULL);
    sem_init(&sem, 0, 0);
    threads.resize(num);
    pthread_mutex_lock(&lock);
    for (int i = 0; i < num; i++) {
        pthread_create(&threads[i], NULL, work, this);
        // printf("thread %d is created\n", threads[i]);
        pthread_detach(threads[i]);
    }    
    pthread_mutex_unlock(&lock);
}

thread_pool::thread_pool() : thread_pool(default_size) {}

thread_pool::~thread_pool() {
    is_shutdown = 1;
    for (int i = 0; i < size; i++)
        sem_post(&sem);
    pthread_mutex_destroy(&lock);
    sem_destroy(&sem);
}

void* thread_pool::work(void* arg) {
    thread_pool* pool = (thread_pool*)arg;
    pool->run();
    return pool;
}

void thread_pool::run() {
    while (true) {
        sem_wait(&sem);
        if (is_shutdown) {
            break;
        }

        pthread_mutex_lock(&lock);
        if (tasks.empty()) {
            pthread_mutex_unlock(&lock);
            continue;
        }
        // printf("thread %d run\n", pthread_self());
        task tmp = tasks.front();
        tasks.pop_front();
        pthread_mutex_unlock(&lock);
        tmp.fun(tmp.arg);
    }
}

void thread_pool::add_job(void (*fun)(void*), void* arg) {
    pthread_mutex_lock(&lock);
    task tmp;
    tmp.fun = fun;
    tmp.arg = arg;
    tasks.push_back(tmp);
    sem_post(&sem);
    pthread_mutex_unlock(&lock);
}

Below is a minimal reproducible example. When I run main, the output of func contains the same number more than once.

#include <stdio.h>
#include "thread_pool.h"

int idx = 0; 

void func(void* arg) {
    printf("%d\n", *(int*)arg);
    usleep(100);
}

int main() {
    thread_pool tp(8);
    while (1) {
        tp.add_job(func, (void*)&idx);
        idx++;
    }
} 

CodePudding user response:

I cannot answer your Q1. As some comments point out, it would help if you could provide a minimal reproducible example and a description of the exact error you get.

However, your Q2 is fairly simple: your program shuts down when main returns. If you have started a bunch of threads, then too bad: they are terminated along with the process, whether or not they have finished (or even started) their work.

The normal way around this is to join the threads you have created at the end of main (or wherever it makes sense). This causes the main thread to wait for the spawned threads to finish.

You instead detach your threads, which means that they clean up after themselves when they finish, but it does nothing for extending the life of your main thread. So they still get brutally stopped when that one returns.

The sleep you mention adding to main simply pushes main to survive longer, and thus lets the spawned threads run a bit. At the end of the sleep main still stops and your threads get forcefully torn down (if they are still running).
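To illustrate the join approach, here is a minimal sketch using plain pthreads rather than your pool. The names `counter`, `bump`, and `run_and_join` are hypothetical and only there for the example. The point is that the caller blocks in `pthread_join` until every worker has finished, so no sleep is needed.

```cpp
#include <cassert>
#include <pthread.h>
#include <vector>

// Hypothetical shared state: a counter protected by a mutex.
struct counter {
    pthread_mutex_t lock;
    int value;
};

// Hypothetical worker: increments the shared counter once.
static void* bump(void* arg) {
    counter* c = static_cast<counter*>(arg);
    pthread_mutex_lock(&c->lock);
    ++c->value;
    pthread_mutex_unlock(&c->lock);
    return nullptr;
}

// Starts n threads and joins each of them, so this function does not
// return until every worker has run to completion.
int run_and_join(int n) {
    counter c;
    pthread_mutex_init(&c.lock, nullptr);
    c.value = 0;

    std::vector<pthread_t> threads(n);
    for (int i = 0; i < n; i++) {
        int rc = pthread_create(&threads[i], nullptr, bump, &c);
        assert(rc == 0);
        (void)rc;
    }
    // Joining (instead of detaching) makes the caller wait for the workers.
    for (int i = 0; i < n; i++)
        pthread_join(threads[i], nullptr);

    pthread_mutex_destroy(&c.lock);
    return c.value;
}
```

Calling `run_and_join(8)` from main returns only after all eight workers have incremented the counter. In a pool like yours, the equivalent would probably be to drop the `pthread_detach` call and join the workers in the destructor before destroying the mutex and semaphore, but that is a suggestion I have not tested against your code.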

CodePudding user response:

After two days of debugging, I found the bug and fixed it.

Q1: My code triggers race conditions, as in the minimal reproducible example, because the argument arg passed with each job points to memory that keeps changing: every job receives the address of the same idx, which main modifies while the worker threads read it. Giving each job its own copy of the value (a deep copy) fixes these race conditions.
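A minimal sketch of the per-job copy idea, using plain pthreads instead of the pool to keep it short. The names `job_arg`, `record`, and `submit_copies` are hypothetical. Each job gets a heap-allocated snapshot of idx taken at submission time, and the job frees that snapshot when it is done.

```cpp
#include <algorithm>
#include <cassert>
#include <pthread.h>
#include <vector>

// Hypothetical job argument: a private snapshot plus a place to record it.
struct job_arg {
    int value;               // copy of idx taken when the job was submitted
    std::vector<int>* seen;  // shared output, protected by *lock
    pthread_mutex_t* lock;
};

// Hypothetical job: records the value it was given, then frees its argument.
static void* record(void* p) {
    job_arg* a = static_cast<job_arg*>(p);
    pthread_mutex_lock(a->lock);
    a->seen->push_back(a->value);
    pthread_mutex_unlock(a->lock);
    delete a;  // the job owns its copy
    return nullptr;
}

// Submits n jobs, each with its own copy of idx, and returns the sorted
// list of values the jobs observed.
std::vector<int> submit_copies(int n) {
    std::vector<int> seen;
    pthread_mutex_t lock;
    pthread_mutex_init(&lock, nullptr);

    std::vector<pthread_t> threads(n);
    int idx = 0;
    for (int i = 0; i < n; i++) {
        // Copy idx by value; later changes to idx cannot affect this job.
        job_arg* a = new job_arg{idx, &seen, &lock};
        int rc = pthread_create(&threads[i], nullptr, record, a);
        assert(rc == 0);
        (void)rc;
        idx++;
    }
    for (int i = 0; i < n; i++)
        pthread_join(threads[i], nullptr);

    pthread_mutex_destroy(&lock);
    std::sort(seen.begin(), seen.end());
    return seen;
}
```

With the copy in place, every job sees the value idx had when the job was submitted, so no number repeats. The same pattern should carry over to add_job: allocate the copy before the call and free it inside the job function.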

Q2: The cause of this bug may be that the worker threads were not given time slices. I am not familiar with the Linux scheduling algorithm, so I added a sleep call in the main thread to force it to sleep and give the worker threads a chance to execute.
