I am writing a C library that at some point forks another process and then waits for its completion.
I'd like to write the code that waits for the child process completion in the most robust and generic way, to take care of all possible scenarios, such as the calling process spawning other child processes, receiving signals etc.
Does the following C code use waitpid properly, i.e. in the most robust way?
    void waitForChildProcess(int child_pid) {
        int rc, err;
        do {
            // Waiting only for my own child and only for its termination.
            // The status value is irrelevant (I think) because option '0' should mean
            // to only wait for a child termination event,
            // and I don't care about the child's exit code:
            rc = waitpid(child_pid, NULL, 0);
            err = errno;
        } while (rc == -1 && err == EINTR); // ignoring a signal
    }
CodePudding user response:
I'd like to write the code that waits for the child process completion in the most robust and generic way.
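A child process is created by the fork syscall. The worst-case scenario is that the child terminates and SIGCHLD is delivered to the parent process before fork returns. The default action for SIGCHLD is to ignore the signal, so unless the signal is blocked or handled, that notification is lost, and code that then waits for the signal hangs indefinitely. (waitpid itself does not depend on the signal: a terminated child stays waitable until it is reaped.)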
The robust POSIX way to handle termination of child processes in any program, including a multi-threaded one, is:
- The main thread blocks SIGCHLD using sigprocmask/pthread_sigmask before any extra threads are created. Child threads inherit the signal mask of the parent thread. In other words, the main function should block the signal as early as possible. (Unless your global C++ object constructors or platform-specific constructor functions spawn new threads before main is entered, but that's outside of the scope/requirements of the C standard, or any platform-specific standard, to my knowledge. glibc may even hang forever if new threads are created before main is entered, and that has been a long-standing bug of glibc.)
- Once child processes are created, one thread must call sigwait or sigwaitinfo to receive a SIGCHLD that has been pending, if any, or wait for it. No signal loss is possible in this case.
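The two steps above can be sketched roughly as follows. This is only a minimal single-threaded illustration, not a definitive implementation: `run_child_and_wait` is a hypothetical name, the child does nothing but exit, and sigprocmask stands in for pthread_sigmask because no extra threads exist here. It also assumes Linux behaviour, where a blocked SIGCHLD with the default disposition stays pending.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with exit_code, receive its
 * SIGCHLD synchronously with sigwait(), then reap it.
 * Returns the child's exit status, or -1 on error.
 * Assumes this process has no other children running. */
int run_child_and_wait(int exit_code)
{
    sigset_t mask, old_mask;
    int sig, status, result = -1;
    pid_t pid, rc;

    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);

    /* Step 1: block SIGCHLD *before* fork() so that it stays pending.
     * In a multi-threaded program this would be pthread_sigmask(),
     * called in main() before any thread is created. */
    if (sigprocmask(SIG_BLOCK, &mask, &old_mask) == -1)
        return -1;

    pid = fork();
    if (pid == 0)
        _exit(exit_code);               /* child: terminate immediately */

    /* Step 2: accept the pending (or future) SIGCHLD, then reap the child. */
    if (pid > 0 && sigwait(&mask, &sig) == 0 && sig == SIGCHLD) {
        do {
            rc = waitpid(pid, &status, 0);
        } while (rc == -1 && errno == EINTR);
        if (rc == pid && WIFEXITED(status))
            result = WEXITSTATUS(status);
    }

    /* Restore the caller's signal mask. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return result;
}
```

Calling `run_child_and_wait(7)` should yield 7 even if the child exits before the parent reaches sigwait, because the blocked signal remains pending in the meantime.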
See sigwaitinfo for a full description of the issues mentioned here and the solution.
Also see the pthread_sigmask example called "Signaling in a Multi-Threaded Process".
Another POSIX option is to install a SIGCHLD signal handler before fork is called. Inside the signal handler only a small subset of functions, the async-signal-safe ones, can be called. That is often too restrictive, so the self-pipe trick is used to delegate signal processing to a non-signal context: the handler writes to a pipe, and some other thread reads that pipe and handles the signal in the "normal" non-signal context.
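A rough sketch of the self-pipe trick, under some assumptions: the names `sigchld_pipe`, `on_sigchld`, `install_sigchld_pipe` and `wait_via_pipe` are made up for illustration, only one child is tracked, and the reader runs in the same thread rather than a dedicated one.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int sigchld_pipe[2] = { -1, -1 };   /* [0] read end, [1] write end */

/* Signal context: only write() is called, and errno is preserved. */
static void on_sigchld(int signo)
{
    int saved_errno = errno;
    unsigned char byte = (unsigned char)signo;
    ssize_t n = write(sigchld_pipe[1], &byte, 1);
    (void)n;                               /* nothing safe to do on failure */
    errno = saved_errno;
}

/* Hypothetical setup; must run before fork(). Returns 0 or -1. */
int install_sigchld_pipe(void)
{
    struct sigaction sa;

    if (pipe(sigchld_pipe) == -1)
        return -1;
    /* Non-blocking write end: the handler must never stall on a full pipe. */
    if (fcntl(sigchld_pipe[1], F_SETFL, O_NONBLOCK) == -1)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Normal context: block until the handler reports a SIGCHLD, then reap
 * the given child. Returns its exit status, or -1 on error. */
int wait_via_pipe(pid_t pid)
{
    unsigned char byte;
    ssize_t n;
    int status;
    pid_t rc;

    do {
        n = read(sigchld_pipe[0], &byte, 1);
    } while (n == -1 && errno == EINTR);
    if (n != 1)
        return -1;

    do {
        rc = waitpid(pid, &status, 0);
    } while (rc == -1 && errno == EINTR);
    return (rc == pid && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```

The byte written by the handler stays in the pipe, so it does not matter whether the child exits before or after the parent starts reading.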
Linux provides the signalfd syscall, which essentially does the self-pipe trick for you, and this is the least tricky and most robust way to handle signals.
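For comparison, here is a minimal Linux-only sketch of the signalfd approach. `run_child_with_signalfd` is a hypothetical helper, the process is assumed to be single-threaded with no other children, and a real program would usually hand the descriptor to poll/epoll instead of doing a blocking read.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (Linux only): fork a child that exits with exit_code,
 * learn about its termination by reading a signalfd, then reap it.
 * Returns the child's exit status, or -1 on error. */
int run_child_with_signalfd(int exit_code)
{
    sigset_t mask, old_mask;
    struct signalfd_siginfo info;
    int sfd, status, result = -1;
    pid_t pid, rc;
    ssize_t n;

    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);

    /* The signal has to be blocked; otherwise it is handled according to
     * its disposition instead of being queued for the descriptor. */
    if (sigprocmask(SIG_BLOCK, &mask, &old_mask) == -1)
        return -1;

    sfd = signalfd(-1, &mask, SFD_CLOEXEC);
    if (sfd != -1) {
        pid = fork();
        if (pid == 0)
            _exit(exit_code);           /* child: terminate immediately */

        if (pid > 0) {
            /* Blocks until a SIGCHLD is available to read. */
            do {
                n = read(sfd, &info, sizeof info);
            } while (n == -1 && errno == EINTR);

            if (n == (ssize_t)sizeof info && info.ssi_signo == (unsigned)SIGCHLD) {
                do {
                    rc = waitpid(pid, &status, 0);
                } while (rc == -1 && errno == EINTR);
                if (rc == pid && WIFEXITED(status))
                    result = WEXITSTATUS(status);
            }
        }
        close(sfd);
    }

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return result;
}
```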
CodePudding user response:
Yes, waitpid(child_pid, ...) is the most robust way.
It will return child_pid if the child process has exited; -1 with errno set if an error occurs (ECHILD if the child process does not exist (was never created or has already been reaped) or is not a child of this process, EINVAL if the options (third parameter) had an invalid value, or EINTR if a signal was delivered to a signal handler that was not installed with the SA_RESTART flag); or 0 if the WNOHANG option (third parameter) was specified and the child process has not yet exited.
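To make the three kinds of return value concrete, here is a small sketch built around WNOHANG. Both `poll_child` and `demo_poll` are hypothetical names introduced only for this illustration; the demo keeps its child blocked on a pipe so that the first poll is guaranteed to find it still running.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking check on one child.
 * Returns 1 if the child has terminated (and is now reaped),
 *         0 if it is still running,
 *        -1 on error, with errno set (ECHILD, EINVAL). */
int poll_child(pid_t child_pid, int *status)
{
    pid_t rc;

    do {
        rc = waitpid(child_pid, status, WNOHANG);
    } while (rc == -1 && errno == EINTR);

    if (rc == child_pid)
        return 1;
    if (rc == 0)
        return 0;
    return -1;
}

/* Hypothetical demo. Returns 0 if the observed sequence was
 * "still running, then exited", and -1 otherwise. */
int demo_poll(void)
{
    int fds[2], status, state;
    pid_t pid;
    char c;

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: wait for EOF on the pipe, then exit. */
        close(fds[1]);
        if (read(fds[0], &c, 1) < 0)
            _exit(1);
        _exit(0);
    }

    close(fds[0]);
    if (poll_child(pid, &status) != 0)      /* child is blocked: expect 0 */
        return -1;

    close(fds[1]);                          /* EOF releases the child */
    while ((state = poll_child(pid, &status)) == 0)
        ;   /* busy-poll for brevity; real code would sleep or do other work */

    return (state == 1 && WIFEXITED(status)) ? 0 : -1;
}
```

Once the child has been reaped, a further `poll_child(pid, &status)` on the same PID would be expected to return -1 with errno set to ECHILD.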
I would recommend a slight change, however:
    #include <errno.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    /* Wait for child process to exit.
     * @child_pid  Process ID of the child process
     * @status     Pointer to where the child status
     *             is stored; may be NULL
     * @return     0 if success
     *             -1 if an error occurs, see errno.
     */
    int waitForChildProcess(pid_t child_pid, int *status)
    {
        int rc;

        /* PIDs 0 and below select process groups or any child; 1 is init. */
        if (child_pid <= 1) {
            errno = EINVAL;
            return -1;
        }

        /* Retry if a signal handler interrupted the wait. */
        do {
            rc = waitpid(child_pid, status, 0);
        } while (rc == -1 && errno == EINTR);

        if (rc == child_pid)
            return 0;

        /* This should not happen, but let's be careful. */
        if (rc != -1)
            errno = ECHILD;
        return -1;
    }
In Linux and POSIXy systems, process IDs are positive integers. As you can see in the man 2 waitpid man page, zero and negative PIDs refer to process groups, and -1 to any child process. Process 1 is special, init; it is the one that never exits and sets up the rest of the userspace. So, the smallest PID a child of the current process can ever have is 2.
I do consider it sensible to use the proper types for these: pid_t for process IDs, and for example size_t for memory sizes of objects (including the return value of say strlen()).
Providing the status pointer (so that the caller can check it with WIFEXITED()/WEXITSTATUS() or WIFSIGNALED()/WTERMSIG()) is a convenience, since any callers not interested in it can provide a NULL. (NULL is explicitly allowed for the status pointer for wait() and waitpid().)
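As a sketch of what a caller might do with the status, the following decodes it with those macros. `describe_status` and `spawn_and_describe` are hypothetical helpers written for this example; the second one inlines its own fork and wait loop only so that the block is self-contained.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <limits.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: describe a wait status in buf.
 * Returns the exit code (0..255) for a normal exit, the negated signal
 * number if the child was killed by a signal, INT_MIN otherwise. */
int describe_status(int status, char *buf, size_t len)
{
    if (WIFEXITED(status)) {
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
        return WEXITSTATUS(status);
    }
    if (WIFSIGNALED(status)) {
        snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
        return -WTERMSIG(status);
    }
    /* Not expected with options == 0, which only reports termination. */
    snprintf(buf, len, "unexpected status 0x%x", (unsigned)status);
    return INT_MIN;
}

/* Hypothetical demo: the child raises kill_sig if it is nonzero,
 * otherwise it exits with exit_code. */
int spawn_and_describe(int exit_code, int kill_sig, char *buf, size_t len)
{
    int status;
    pid_t rc, pid = fork();

    if (pid == -1)
        return INT_MIN;
    if (pid == 0) {
        if (kill_sig != 0)
            raise(kill_sig);
        _exit(exit_code);
    }

    do {
        rc = waitpid(pid, &status, 0);
    } while (rc == -1 && errno == EINTR);
    if (rc != pid)
        return INT_MIN;

    return describe_status(status, buf, len);
}
```

A caller that passes NULL as the status simply skips this step.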
Technically, with options==0, waitpid() should only ever return either the child PID, or -1 (with errno set). However, since the check is so cheap, I prefer to treat everything else as an ECHILD error, since that gives the most robust results.
The caller is free to ignore the return value. However, if they want to know, the return value is 0 if successful, otherwise -1 with errno set (and strerror(errno) provides the textual reason).