Best practice for waiting for a child process termination in C

I am writing a C library that at some point forks another process and then waits for its completion.

I'd like to write the code that waits for the child process completion in the most robust and generic way, to take care of all possible scenarios, such as the calling process spawning other child processes, receiving signals etc.

Does the following C code use waitpid properly, i.e. in the most robust way?

void waitForChildProcess(int child_pid) {
    int rc, err;
    do {
        //waiting only for my own child and only for its termination.
        //The status value is irrelevant (I think) because option '0' should mean 
        //to only wait for a child termination event 
        // and I don't care about the child's exit code:
        rc = waitpid(child_pid, NULL, 0);
        err = errno;
    } while (rc == -1 && err == EINTR); //ignoring a signal
}

CodePudding user response：

I'd like to write the code that waits for the child process completion in the most robust and generic way.

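A child process is created by the fork syscall. The worst-case scenario is that the child terminates and SIGCHLD is delivered to the parent process before fork returns. The default action for SIGCHLD is to ignore the signal, so code that relies on the signal to learn about the termination can miss it and then wait indefinitely. (A blocking waitpid on the child's PID is not affected by this: a terminated child stays a zombie until it is reaped, so waitpid returns immediately.)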

The robust POSIX way to handle termination of child processes in any/multi-threaded program is:

The main thread blocks SIGCHLD using sigprocmask/pthread_sigmask before any extra threads are created. Child threads inherit the signal mask of the parent thread. In other words, the main function should block the signal as early as possible. (Unless your global C object constructor functions or platform-specific constructor functions spawn new threads before main is entered, but that is outside the scope/requirements of the C standard, or of any platform-specific standard, to my knowledge. glibc may even hang forever if new threads are created before main is entered, and that has been a long-standing glibc bug.)
Once child processes are created, one thread must call sigwait or sigwaitinfo to receive a SIGCHLD that is already pending, if any, or to wait for one. No signal loss is possible in this case.

See sigwaitinfo for full description of the issues mentioned here and the solution.

Also see pthread_sigmask example called "Signaling in a Multi-Threaded Process".
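The block-then-sigwait approach described above can be sketched roughly as follows. This is a minimal single-threaded illustration, not code from the referenced man pages: the helper name run_child_and_sigwait is hypothetical, it uses sigprocmask where a multi-threaded program would use pthread_sigmask, and it assumes the SIGCHLD disposition is the default.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the answer): forks a child that exits with
 * `code`, waits for SIGCHLD synchronously, and returns the child's exit
 * code, or -1 on error. */
int run_child_and_sigwait(int code)
{
    sigset_t set, old;
    sigemptyset(&set);
    sigaddset(&set, SIGCHLD);

    /* Block SIGCHLD before fork() so that it stays pending until we ask
     * for it. A multi-threaded program would call pthread_sigmask() in
     * main() before creating any threads. */
    if (sigprocmask(SIG_BLOCK, &set, &old) == -1)
        return -1;

    int result = -1;
    pid_t pid = fork();
    if (pid == 0)
        _exit(code);                      /* child: terminate immediately */

    while (pid > 0) {
        siginfo_t info;
        /* Returns a pending SIGCHLD at once, or sleeps until one arrives. */
        if (sigwaitinfo(&set, &info) == -1) {
            if (errno == EINTR)
                continue;
            break;
        }
        /* Pending SIGCHLDs coalesce, so the signal only says "some child
         * changed state"; ask waitpid() whether it was ours. */
        int status;
        pid_t rc = waitpid(pid, &status, WNOHANG);
        if (rc == 0)
            continue;                     /* not our child yet: keep waiting */
        if (rc == pid && WIFEXITED(status))
            result = WEXITSTATUS(status);
        break;
    }

    sigprocmask(SIG_SETMASK, &old, NULL); /* restore the caller's mask */
    return result;
}
```

Because the signal is blocked before fork, it does not matter whether the child exits before or after the parent reaches sigwaitinfo.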

Another POSIX option is to install a SIGCHLD signal handler before fork is called. Inside the signal handler, only a small subset of functions (the async-signal-safe ones) can be called. That is often too restrictive, so the self-pipe trick is used to delegate signal processing to a non-signal context: the signal handler writes to a pipe, and some other thread reads from that pipe and handles the signal in the "normal" non-signal context.
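A minimal sketch of the self-pipe trick might look like this. The names selfpipe_wfd, on_sigchld and run_child_with_self_pipe are hypothetical, and for brevity the same thread that forks also reads the pipe; in a real program the read end would typically sit in another thread or in a poll/select loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe; set before the handler is installed. */
static volatile sig_atomic_t selfpipe_wfd = -1;

/* Signal handler: only calls write(), which is async-signal-safe. */
static void on_sigchld(int signo)
{
    int saved_errno = errno;
    unsigned char byte = (unsigned char)signo;
    ssize_t n = write(selfpipe_wfd, &byte, 1);
    (void)n;                    /* a full pipe already holds a wakeup */
    errno = saved_errno;
}

/* Hypothetical helper: returns the child's exit code, or -1 on error. */
int run_child_with_self_pipe(int code)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;
    /* Non-blocking write end, so the handler can never stall. */
    fcntl(fds[1], F_SETFL, fcntl(fds[1], F_GETFL) | O_NONBLOCK);
    selfpipe_wfd = fds[1];

    struct sigaction sa, old_sa;
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;

    int result = -1;
    if (sigaction(SIGCHLD, &sa, &old_sa) == 0) {  /* install before fork() */
        pid_t pid = fork();
        if (pid == 0)
            _exit(code);
        while (pid > 0) {
            unsigned char byte;
            ssize_t n = read(fds[0], &byte, 1);   /* normal, non-signal context */
            if (n == -1 && errno == EINTR)
                continue;
            if (n != 1)
                break;
            int status;
            pid_t rc = waitpid(pid, &status, WNOHANG);
            if (rc == 0)
                continue;                         /* some other child */
            if (rc == pid && WIFEXITED(status))
                result = WEXITSTATUS(status);
            break;
        }
        sigaction(SIGCHLD, &old_sa, NULL);        /* restore old disposition */
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

The handler does nothing but write one byte; all real work (here, the waitpid call) happens after read returns in ordinary code.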

Linux provides the signalfd syscall, which essentially does the self-pipe trick for you; this is the least tricky and most robust way to handle signals.
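As a rough, Linux-only sketch of the signalfd variant (the helper name run_child_with_signalfd is hypothetical, and a real program would usually add the descriptor to a poll/epoll loop instead of blocking in read):

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the child's exit code, or -1 on error. */
int run_child_with_signalfd(int code)
{
    sigset_t set, old;
    sigemptyset(&set);
    sigaddset(&set, SIGCHLD);

    /* The signal must be blocked, otherwise it is delivered the usual
     * way and never shows up on the descriptor. */
    if (sigprocmask(SIG_BLOCK, &set, &old) == -1)
        return -1;

    int result = -1;
    int sfd = signalfd(-1, &set, SFD_CLOEXEC);
    if (sfd != -1) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(code);
        while (pid > 0) {
            struct signalfd_siginfo si;
            /* Each successful read returns one queued signal record. */
            ssize_t n = read(sfd, &si, sizeof si);
            if (n == -1 && errno == EINTR)
                continue;
            if (n != (ssize_t)sizeof si)
                break;
            int status;
            pid_t rc = waitpid(pid, &status, WNOHANG);
            if (rc == 0)
                continue;                 /* SIGCHLD was for another child */
            if (rc == pid && WIFEXITED(status))
                result = WEXITSTATUS(status);
            break;
        }
        close(sfd);
    }

    sigprocmask(SIG_SETMASK, &old, NULL); /* restore the caller's mask */
    return result;
}
```

No handler is installed at all here, so the async-signal-safety restrictions do not come into play.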

CodePudding user response：

Yes, waitpid(child_pid, ...) is the most robust way.

It returns child_pid if the child process has exited, or 0 if the WNOHANG option (third parameter) was specified and the child process has not yet exited. On error it returns -1 with errno set: ECHILD if the child process does not exist (was never created or has already been reaped) or is not a child of this process; EINVAL if the options (third parameter) had an invalid value; or EINTR if a signal was delivered to a signal handler that was installed without the SA_RESTART flag.
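To make those return values concrete, here is a small sketch that records what waitpid returns at three points in a child's life. The struct waitpid_demo and the function waitpid_demo_run are hypothetical names used only for this illustration, and the sketch assumes SIGCHLD has its default disposition.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical record of what waitpid() returned at each stage. */
struct waitpid_demo {
    pid_t child;          /* PID returned by fork() */
    pid_t while_running;  /* WNOHANG while the child is still alive */
    pid_t after_exit;     /* blocking wait once the child has exited */
    pid_t second_wait;    /* waiting again for the reaped child */
    int   second_errno;   /* errno from that second wait */
};

/* Returns 0 if the demo ran, -1 if pipe() or fork() failed. */
int waitpid_demo_run(struct waitpid_demo *out)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        char c;
        close(fds[1]);
        /* Stay alive until the parent closes its write end (read gives EOF). */
        while (read(fds[0], &c, 1) == -1 && errno == EINTR)
            ;
        _exit(0);
    }
    close(fds[0]);

    out->child = pid;
    out->while_running = waitpid(pid, NULL, WNOHANG);  /* child alive: 0 */

    close(fds[1]);                                     /* let the child exit */
    do {
        out->after_exit = waitpid(pid, NULL, 0);       /* child exited: pid */
    } while (out->after_exit == -1 && errno == EINTR);

    out->second_wait = waitpid(pid, NULL, 0);          /* already reaped: -1 */
    out->second_errno = errno;                         /* ECHILD */
    return 0;
}
```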

I would recommend a slight change, however:

#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Wait for child process to exit.
 * @child_pid   Process ID of the child process
 * @status      Pointer to where the child status
 *              is stored; may be NULL
 * @return       0  if success
 *              -1  if an error occurs, see errno.
*/
int waitForChildProcess(pid_t child_pid, int *status)
{
    pid_t rc;

    /* PIDs <= 0 select process groups or any child; PID 1 is init. */
    if (child_pid <= 1) {
        errno = EINVAL;
        return -1;
    }

    /* Retry if the wait was interrupted by a signal handler. */
    do {
        rc = waitpid(child_pid, status, 0);
    } while (rc == -1 && errno == EINTR);
    if (rc == child_pid)
        return 0;

    /* This should not happen, but let's be careful. */
    if (rc != -1)
        errno = ECHILD;

    return -1;
}

In Linux and POSIXy systems, process IDs are positive integers. As you can see in the man 2 waitpid man page, zero and negative PIDs refer to process groups, and -1 to any child process. Process 1 is special: it is init, the process that never exits and sets up the rest of the userspace. So, the smallest PID a child of the current process can ever have is 2.

I do consider it sensible to use the proper types for these: pid_t for process IDs, and for example size_t for memory sizes of objects (including the return value of, say, strlen()).

Providing the status pointer (so that the caller can check it with WIFEXITED()/WEXITSTATUS() or WIFSIGNALED()/WTERMSIG()) is a convenience, since any callers not interested in it can provide a NULL. (NULL is explicitly allowed for the status pointer for wait() and waitpid().)
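As an illustration of how a caller might inspect the status with those macros, here is a minimal sketch. The helpers describe_status and spawn_and_describe are hypothetical names, not part of the answer's code.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: writes a human-readable form of a wait status. */
void describe_status(int status, char *buf, size_t len)
{
    if (WIFEXITED(status))
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
    else
        snprintf(buf, len, "unexpected status 0x%x", (unsigned)status);
}

/* Hypothetical demo: the child raises `sig` if it is non-zero, otherwise
 * exits with `code`; the parent waits and describes the outcome.
 * Returns 0 on success, -1 on error (see errno). */
int spawn_and_describe(int code, int sig, char *buf, size_t len)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        if (sig != 0)
            raise(sig);          /* default action terminates the child */
        _exit(code);
    }

    int status;
    pid_t rc;
    do {
        rc = waitpid(pid, &status, 0);
    } while (rc == -1 && errno == EINTR);
    if (rc != pid)
        return -1;

    describe_status(status, buf, len);
    return 0;
}
```

For example, spawn_and_describe(3, 0, buf, sizeof buf) fills buf with "exited with code 3", while passing SIGKILL as sig yields a "killed by signal" message with that signal's number.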

Technically, with options==0, waitpid() should only ever return either the child PID, or -1 (with errno set). However, since the check is so cheap, I prefer to treat everything else as an ECHILD error, since that gives the most robust results.

The caller is free to ignore the return value. However, if they want to know, the return value is 0 if successful, otherwise -1 with errno set (and strerror(errno) provides the textual reason).