Properly reaping all child processes and collecting exit status

Time:04-14

I want to reap all child processes forked by a parent process, then collect the last child's exit status. To that end, I call sigsuspend() to wait for SIGCHLD. When SIGCHLD arrives, the handler calls waitpid() in a loop until it indicates there are no children left to reap. The handler then stores the exit status, and main() breaks out of its loop and terminates.

However, I noticed that this is not correct: the children are not always all reaped. How can I fix this?

#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>

volatile sig_atomic_t exit_stat;

// Signal Handler
void sigchld_handler(int sig) {
    pid_t pid;
    int status;
    while(1) {  
        pid = waitpid(-1, &status, WNOHANG);
        if(pid <= 0) {break;}
        if(WIFEXITED(status)) {
            printf("%s", "Exited correctly.");
        }
        else {
            printf("%s", "Bad exit.");
        }
    }
    exit_stat = status;
}


// Executing code.
int main() {    
    signal(SIGCHLD, sigchld_handler);
    
    sigset_t mask_child;
    sigset_t old_mask;
    sigemptyset(&mask_child);
    sigaddset(&mask_child, SIGCHLD);
    sigprocmask(SIG_BLOCK, &mask_child, &old_mask);
    
    for(int i = 0; i < 5; i++) {
        int child_pid = fork();
        if(child_pid == 0) {
            // Child: perform execvp call.
            char* argv[] = {"echo", "hi", NULL};
            execvp(argv[0], argv);
        }
    }
    
    while(!exit_stat) {
        sigsuspend(&old_mask);
    }
    
    return 0;
}

Answer:

Transferring lightly modified comments into an answer.

The WNOHANG option to waitpid() means "return immediately if there are no children left, OR if there are children left but they're still running". If you really want to wait for all children to exit, either omit the WNOHANG option to waitpid() or simply use wait() instead. Note that if there were tasks launched in the background, they may not terminate for a very long time, if ever. It also depends on the context whether 'the last child to die' is the correct one to report on. It is possible to imagine scenarios where that is not appropriate.

You're right. In this instance, by "the last child to die" I meant the last child that was forked. Can I fix this by adding a simple condition that checks whether the PID returned by wait() equals the PID of the last forked child?

If you're interested in the last child in the most recent pipeline (e.g. ls | grep … | sort … | wc and you want to wait for wc), then you know the PID for wc, and you can use waitpid(wc_pid, &status, 0) to wait for that process specifically to die. Alternatively, you can use your loop to collect bodies until you either find the body of wc or get 'no dead processes left'. At that point, you can decide to wait specifically for the wc PID, or (better) use waitpid() without WNOHANG (or use wait()) until some process dies. Again, you can check whether it was wc and, if not, repeat the WNOHANG corpse collection to collect any zombies. Repeat until you do find the corpse of wc.
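One way to sketch the "collect bodies until the interesting one turns up" idea is shown below. It is a simplified variant, not the exact sequence described above: it blocks for any child, checks whether that child is the target, and does a non-blocking sweep for remaining zombies once the target has been found. The name wait_for_target() is hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: reap children until `target` has died.
 * Returns 0 and stores the target's raw wait status in *target_status,
 * or returns -1 if waitpid() fails (for example with ECHILD because
 * `target` is not an unreaped child of this process). */
int wait_for_target(pid_t target, int *target_status) {
    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, 0);   /* block for any child */
        if (pid == -1) {
            if (errno == EINTR) {
                continue;                      /* interrupted; retry */
            }
            return -1;
        }
        if (pid == target) {
            *target_status = status;
            /* Collect any other children that are already dead,
             * without blocking on the ones still running. */
            while (waitpid(-1, &status, WNOHANG) > 0) {
                /* status of the other children is discarded here */
            }
            return 0;
        }
        /* Some other child died first; it is now reaped. Keep waiting. */
    }
}
```

Children that are still running when the target dies are left alone, so a later call to wait() or waitpid() is still needed to avoid leaving them as zombies when they eventually exit.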

And also, you said that background tasks may not terminate for a long time. By this, do you mean that waitpid(-1, &status, 0) will completely suspend all processes until a child is ready to be reaped?

waitpid(-1, &status, 0); will make the parent process wait indefinitely until some child process dies, or it will return -1 with errno set to ECHILD because there are no children left to wait for (which indicates there was a housekeeping error; children should not die without the parent knowing). Only the calling process is blocked; the children keep running.

Note that using a 'wait for any child' loop avoids leaving zombies around (children that have died but not been waited for). This is generally a good idea. But capturing when the child you're currently interested in dies ensures that your shell doesn't hang around waiting when it wasn't necessary. So, you need to capture both the PID and the exit status of the dead child processes.
