I am working on a project that uses pthread_create to create several child threads. The thread creation logic is not under my control, as it's implemented by another part of the project.
Each thread performs an operation that takes more than 30 seconds to complete. Under normal conditions the program works perfectly fine, but a problem occurs when the program terminates. I need to exit from main as quickly as possible when I receive the SIGINT signal.
When I call exit() or return from main, the exit handlers and global objects' destructors are called, and I believe these operations race with the threads that are still running. I believe there are many such race conditions, which makes it hard to fix all of them. The way I see it, there are two solutions:
- Call _exit() and forget about deallocating resources.
- When SIGINT arrives, close/kill all threads and then call exit() from the main thread, which will release resources.
I think the first option will work, but I do not want to terminate the process abruptly. So I want to know whether it is possible to terminate all child threads as quickly as possible, so that the exit handlers and destructors can perform the required clean-up and the program can terminate.
I have gone through the post "POSIX API call to list all the pthreads running in a process"; let me know if you know of other ways.
Also, let me know if there is any other solution to this problem.
CodePudding user response:
What is it that you need to do before the program quits? If the answer is 'deallocate resources', then you don't need to worry. If you call _exit, the program will exit immediately and the OS will clean up everything for you.
Be aware also that what you can safely do in a signal handler is extremely limited, so attempting to perform any cleanup yourself is not recommended. If you're interested, there's a list of what you can do here. But you can't flush buffered stdio output with fflush(), for example (which is about the only thing I can think of that you might legitimately want to do here). That's off limits.
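To illustrate how little a handler should do, here is a minimal sketch in C. The names stop_requested, install_sigint_handler and stop_was_requested are made up for this example and are not from the question's code base. The handler only records that SIGINT arrived; anything heavier is left to ordinary code that checks the flag later.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag name. volatile sig_atomic_t is the type the C standard
   guarantees can be written safely from a signal handler. */
static volatile sig_atomic_t stop_requested = 0;

static void on_sigint(int signo)
{
    (void)signo;
    stop_requested = 1;   /* no printf, no free, no fflush in here */
}

/* Installs the handler for SIGINT. Returns 0 on success, -1 on failure,
   mirroring sigaction(). */
int install_sigint_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}

/* Called from normal (non-handler) code to see whether SIGINT was received. */
int stop_was_requested(void)
{
    return stop_requested != 0;
}
```

The main thread would call install_sigint_handler() once at startup and then poll stop_was_requested() at points where it is safe to begin shutting down.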
CodePudding user response:
I need to exit from main as quickly as possible when I receive the SIGINT signal.
How is that defined? Because there's no way to "exit as quickly as possible" when you receive one signal like that.
You can either set flag(s), post to semaphore(s), or similar to set a state that tells other threads it's time to shut down, or you can kill the entire process.
If you elect to set flag(s) or similar to tell the other threads to shut down, you set those flags and return from your signal handler and hope the threads behave and the process shuts down cleanly.
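As a rough sketch of the flag approach, assuming the worker code can be changed to poll a flag between short slices of work (the question says thread creation is owned by another part of the project, so this may not hold). All names here, such as shutting_down and run_cooperative_shutdown, are hypothetical. The sleep stands in for real work, and the flag is set directly where a real program would react to the signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

#define N_WORKERS 4   /* hypothetical worker count */

static atomic_bool shutting_down;       /* set once, read by every worker */
static atomic_int  workers_cleaned_up;  /* counts workers that left their loop and returned */

static void sleep_ms(long ms)
{
    struct timespec ts = { .tv_sec = ms / 1000, .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

static void *worker(void *arg)
{
    (void)arg;
    /* Stand-in for a 30-second job, cut into 10 ms slices so the flag
       is checked often. */
    for (int slice = 0; slice < 3000; ++slice) {
        if (atomic_load(&shutting_down))
            break;
        sleep_ms(10);
    }
    /* Per-thread cleanup would go here, before the thread returns. */
    atomic_fetch_add(&workers_cleaned_up, 1);
    return NULL;
}

/* Starts the workers, requests shutdown shortly afterwards, joins them,
   and returns how many workers finished their cleanup. */
int run_cooperative_shutdown(void)
{
    pthread_t tids[N_WORKERS];
    int started = 0;

    atomic_store(&shutting_down, false);
    atomic_store(&workers_cleaned_up, 0);

    for (int i = 0; i < N_WORKERS; ++i) {
        if (pthread_create(&tids[started], NULL, worker, NULL) != 0)
            break;
        ++started;
    }

    sleep_ms(50);                        /* pretend SIGINT arrives about now */
    atomic_store(&shutting_down, true);  /* tell every worker to stop */

    for (int i = 0; i < started; ++i)
        pthread_join(tids[i], NULL);     /* after this, exit() no longer races with workers */

    return atomic_load(&workers_cleaned_up);
}
```

Once every pthread_join has returned, no worker is running any more, so calling exit() from main should no longer race with them. How quickly shutdown happens depends entirely on how often the workers check the flag.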
If you elect to kill threads, there's effectively no difference between killing a thread, killing the process, or calling _exit(). You might as well just keep it simple and call _exit().
That's all you can choose between when you have to make your decision in a single signal handler call. Pick one.
A better solution is to use escalating signals. For example, when you get SIGQUIT or SIGINT, you set flag(s) or otherwise tell threads it's time to clean up and exit the process - or else. Then, say five seconds later, whatever is shutting down your process sends SIGTERM and the "or else" happens. When you get SIGTERM, your signal handler simply calls _exit() - those threads had their chance and they messed it up and that's their fault. Or you can call abort() to generate a core file and maybe provide enough evidence to fix the miscreant threads that won't shut down.
And finally, five seconds later the managing process will nuke the process from orbit with SIGKILL just to be sure.
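A minimal sketch of the escalation idea in C, under some assumptions: the names (polite_stop_requested, install_escalating_handlers, run_escalation_demo) and the exit status 70 are invented for illustration, and the demo forks a child that sends the signals to itself with raise() instead of relying on an external managing process and a five-second delay.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define FORCED_EXIT_STATUS 70   /* hypothetical status meaning "forced exit" */

static volatile sig_atomic_t polite_stop_requested = 0;

/* First stage (SIGINT or SIGQUIT): only ask the threads to stop. */
static void on_polite_signal(int signo)
{
    (void)signo;
    polite_stop_requested = 1;
}

/* Second stage (SIGTERM): the "or else". _exit() is async-signal-safe and
   skips atexit handlers and global destructors. */
static void on_sigterm(int signo)
{
    (void)signo;
    _exit(FORCED_EXIT_STATUS);
}

/* Returns 0 on success, -1 if any handler could not be installed. */
int install_escalating_handlers(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);

    sa.sa_handler = on_polite_signal;
    if (sigaction(SIGINT, &sa, NULL) != 0 || sigaction(SIGQUIT, &sa, NULL) != 0)
        return -1;

    sa.sa_handler = on_sigterm;
    return sigaction(SIGTERM, &sa, NULL);
}

/* Demo: a child gets SIGINT, fails to shut down (a "miscreant"), then gets
   SIGTERM. Returns the child's exit status, or -1 on error. */
int run_escalation_demo(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (install_escalating_handlers() != 0)
            _exit(1);
        raise(SIGINT);               /* stage one: flag is set, process keeps running */
        if (!polite_stop_requested)
            _exit(2);
        raise(SIGTERM);              /* stage two: handler calls _exit() */
        _exit(3);                    /* not reached if the handler ran */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

In a real deployment the delays and the final SIGKILL would come from whatever supervises the process. Swapping _exit() for abort() in on_sigterm would give the core-file variant described above.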