I'm trying to open and read the contents of every /proc/[pid]/stat file.
But open() fails with ENOENT for every pid >= 10961.
In the proc man page I found this:
In addition, if a process becomes a zombie (one that has been terminated by its parent with an exit call but has not been suspended by a wait call), most of its associated /proc files disappear from the directory structure. Normally, later attempts to open or to read or write to files that are opened before the process is terminated elicit the ENOENT message.
But I can still see the content of the file with the cat command, using the same path that produced ENOENT.
This is confusing. Is this a zombie process or not? Why can't I open it?
code
void get_stat(char *path)
{
    int fd;
    char *res;

    printf("path : %s\n", path);
    fd = open(path, O_RDONLY);
    if (fd < 0)
    {
        perror("open error");
        exit(EXIT_FAILURE);
    }
    res = read_file(fd);
}
output
... worked fine before 10961 ...
path : /proc/6215/stat
path : /proc/6354/stat
path : /proc/10961/stat
open error: No such file or directory
path : /proc/12049/stat
open error: No such file or directory
path : /proc/12127/stat
open error: No such file or directory
path : /proc/12168/stat
open error: No such file or directory
path : /proc/12169/stat
open error: No such file or directory
path : /proc/12171/stat
open error: No such file or directory
path : /proc/12230/stat
open error: No such file or directory
path : /proc/12238/stat
open error: No such file or directory
path : /proc/13185/stat
open error: No such file or directory
path : /proc/13284/stat
open error: No such file or directory
path : /proc/13285/stat
open error: No such file or directory
path : /proc/13466/stat
open error: No such file or directory
path : /proc/13522/stat
open error: No such file or directory
path : /proc/13523/stat
open error: No such file or directory
path : /proc/13532/stat
open error: No such file or directory
path : /proc/13579/stat
open error: No such file or directory
path : /proc/13580/stat
open error: No such file or directory
path : /proc/13589/stat
open error: No such file or directory
path : /proc/13636/stat
open error: No such file or directory
path : /proc/13637/stat
open error: No such file or directory
path : /proc/13726/stat
open error: No such file or directory
path : /proc/14416/stat
open error: No such file or directory
path : /proc/15059/stat
open error: No such file or directory
path : /proc/15153/stat
open error: No such file or directory
path : /proc/15255/stat
open error: No such file or directory
path : /proc/15571/stat
open error: No such file or directory
path : /proc/15573/stat
open error: No such file or directory
path : /proc/15603/stat
open error: No such file or directory
path : /proc/15697/stat
open error: No such file or directory
path : /proc/15744/stat
open error: No such file or directory
path : /proc/15771/stat
open error: No such file or directory
path : /proc/15790/stat
open error: No such file or directory
EDIT: I get the same result with sudo.
I build the path strings with opendir/readdir in the same program, so it only sees directories that exist.
So some processes existed when I got their path names, then vanished, and then reappeared.
Using cat, I found that many of them are named 'kworker'.
Maybe that kind of process can behave like that? I'm looking into it now.
CodePudding user response:
First, check that you are running with superuser rights so that you can access all the processes. But if that were the problem, I guess you would get a permission error (EACCES or EPERM) rather than ENOENT.
Anyway, we can't reliably access all the files in /proc/, because processes appear and disappear throughout the life of the system. Even the program you launch creates an entry in /proc and makes it disappear when it finishes...
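As a minimal sketch of how a scanner can tolerate that race, the function below walks a directory such as /proc and simply skips any entry whose stat file has vanished between readdir() and open(). The name read_all_stats and its return convention are hypothetical, chosen only for this illustration:

```c
#include <ctype.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: tries to read <root>/<pid>/stat for every numeric
 * entry of <root>. Returns how many stat files were read successfully,
 * or -1 if <root> itself cannot be opened. */
int read_all_stats(const char *root)
{
    DIR *dir = opendir(root);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        /* Only entries starting with a digit are process directories. */
        if (!isdigit((unsigned char)ent->d_name[0]))
            continue;

        char path[4096];
        snprintf(path, sizeof path, "%s/%s/stat", root, ent->d_name);

        int fd = open(path, O_RDONLY);
        if (fd < 0) {
            /* The process may have exited since readdir(): not fatal. */
            if (errno != ENOENT && errno != ESRCH)
                perror(path);
            continue;
        }

        char buf[1024];
        ssize_t n = read(fd, buf, sizeof buf - 1);
        close(fd);
        if (n <= 0)
            continue;   /* the process may also die between open() and read() */

        buf[n] = '\0';  /* buf now holds the stat line for this pid */
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling read_all_stats("/proc") from main() would then report how many processes were still alive when their stat file was read, instead of exiting on the first ENOENT.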
The reason why you get the error for pids bigger than a certain value is that each newly created process gets the next pid in increasing order. The maximum value before the numbering wraps around is in /proc/sys/kernel/pid_max, and the last pid that was allocated is in /proc/sys/kernel/ns_last_pid. Moreover, low pid values typically belong to long-running processes...
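To look at those two values from a program, a small helper along these lines should be enough. The name read_long_from_file is hypothetical, and ns_last_pid may not be present on every kernel build, in which case the helper just returns -1:

```c
#include <stdio.h>

/* Hypothetical helper: returns the first integer stored in the file,
 * or -1 if the file cannot be opened or does not start with a number. */
long read_long_from_file(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    long value;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;

    fclose(f);
    return value;
}
```

For example, read_long_from_file("/proc/sys/kernel/pid_max") and read_long_from_file("/proc/sys/kernel/ns_last_pid") can be printed side by side to see how close the pid counter is to wrapping around.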
PS: A dead process may remain in a zombie state after it terminates, so that its termination status stays available to the reaper process, which is:
- Either its father, if the latter is still running (it is supposed to get the status with a call to wait())
- Or the init process, if the father is dead.
Consider the following program, where a process creates a child process. The latter terminates (it merely calls exit(2)). But the father process does not reap it, as it waits indefinitely by calling pause():
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>

int main(void)
{
    if (fork() == 0) {
        // child process: terminates immediately
        exit(2);
    }
    // father process: never calls wait(), so the child stays a zombie
    pause();
    return 0;
}
Compile it and run it in background:
$ gcc pf.c -o pf
$ ./pf &
[3] 29979
The ps command still shows the child process (pid = 29980), which is dead:
$ ps -ef
[...]
xxx 29979 17898 0 20:19 pts/0 00:00:00 ./pf
xxx 29980 29979 0 20:19 pts/0 00:00:00 [pf] <defunct>
xxx 29983 17898 0 20:20 pts/0 00:00:00 ps -ef
The kernel waits for the father process to reap the child, but as the father is blocked indefinitely in pause(), it never does the job (status collection with a call to wait()). Hence, the child process is in the zombie state (the letter Z in the 3rd field of the stat file):
$ cat /proc/29980/stat
29980 (pf) Z 29979 29979 17898 34816 29991 4227140 18 0 0 0 0 0 0 0 20 0 1 0 3701092 0 0 18446744073709551615 0 0 0 0 0 0 0 0 0 0 0 0 17 3 0 0 0 0 0 0 0 0 0 0 0 0 512
If you kill the father process, the child process will be attached to the init process, which will reap it. If your program tries to read the stat file at that moment, it may encounter the ENOENT error, because the reaper has collected the status and thereby triggered the removal of the /proc entry.
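To check the state letter from a program rather than by eye, here is a minimal sketch. It assumes the stat line has already been read into a string; stat_state is a hypothetical name. It searches for the last ')' because the command name in the 2nd field is enclosed in parentheses and may itself contain spaces or parentheses:

```c
#include <string.h>

/* Hypothetical helper: returns the state letter (3rd field) of a
 * /proc/[pid]/stat line, or '\0' if the line is malformed. */
char stat_state(const char *line)
{
    /* The 2nd field, "(comm)", ends at the last ')' of the line. */
    const char *p = strrchr(line, ')');

    /* Expect ") X" where X is the state letter. */
    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return '\0';

    return p[2];
}
```

With the line shown above for pid 29980, stat_state() returns 'Z', so a scanner can tell a zombie apart from a process that has already been reaped (for which open() fails with ENOENT).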