I'm using execvp in my code and, as I understand it, it invokes the command it is passed directly, unlike popen or system, which go through a shell.
I also understand that execvp inspects the PATH variable to look for the command in various directories, which is why I use it: I don't have to pass the full path of the program.
So how does execvp inspect the PATH variable if it does not even invoke a shell? My intuition tells me it could be because every process started after logging in is an indirect child of a shell session, but I might be wrong.
CodePudding user response:
Environment Variables and Shell Variables Are Different
One common misconception, perhaps one on which this question is partially based, is that all shell variables are "environment variables".
This is not true. Shells have regular shell variables, which are stored in heap memory, are local to that shell process, and are not propagated across process boundaries. They also have environment variables, a copy of which is kept in the per-process environment block that the kernel sets up when a program is executed. The space available for this block is limited (the limit is typically shared with command-line arguments, so more or larger environment variables leave less room for arguments), which makes environment variables "more expensive" than regular shell variables.
Environment Variables Are Copied To Child Processes On Creation, And Accessible By Non-Shell Software
When shell variables that are flagged as exported change, a typical shell written in C uses setenv() or putenv() to copy the new value into the process-information space.
This block of space is, like the rest of the process's memory, copied on fork(), and preserved through exec*() calls that do not explicitly overwrite it, which is how environment variables are inherited by subprocesses.
The exec specification linked above has detailed information on how environment variables are passed between and represented within C programs; reading it is strongly recommended.
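As a rough illustration of the inheritance described above, here is a minimal sketch that sets a variable with setenv(), forks, and checks what the child sees. The function name child_inherits and the variable name used in the usage note are made up for this example. It only exercises the fork() half of the story, not exec*().

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a forked child sees `value` under
   `name`, 0 if it does not, and -1 on error. */
int child_inherits(const char *name, const char *value)
{
    /* Put the variable into this process's environment. */
    if (setenv(name, value, 1) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: it received a copy of the parent's environment. */
        const char *seen = getenv(name);
        int ok = seen != NULL && strcmp(seen, value) == 0;

        /* This modifies only the child's copy, never the parent's. */
        setenv(name, "changed-in-child", 1);
        _exit(ok ? 0 : 1);
    }

    /* Parent: wait for the child and report what it observed. */
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Calling child_inherits("DEMO_VAR", "hello") should return 1, and getenv("DEMO_VAR") in the parent should still return "hello" afterwards, because the child's later setenv() touched only its own copy.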
Any program can call getenv() (or implement the same lookup itself, as the libc getenv implementation does) to read the value of these variables.
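To show that there is nothing shell-specific about the lookup, here is a small sketch of a hand-written equivalent that walks the environ array of "NAME=value" strings. The names lookup_env and agrees_with_getenv are invented for this example; real libc implementations differ in detail (locking, for instance).

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* The process environment: a NULL-terminated array of "NAME=value" strings. */
extern char **environ;

/* Hypothetical getenv() look-alike: returns a pointer to the value of
   `name`, or NULL if the variable is not set. */
const char *lookup_env(const char *name)
{
    size_t len = strlen(name);
    if (environ == NULL || len == 0)
        return NULL;

    for (char **entry = environ; *entry != NULL; ++entry) {
        /* Match "NAME=" exactly, then return what follows the '='. */
        if (strncmp(*entry, name, len) == 0 && (*entry)[len] == '=')
            return *entry + len + 1;
    }
    return NULL;
}

/* Returns 1 if lookup_env() and the libc getenv() give the same answer. */
int agrees_with_getenv(const char *name)
{
    const char *mine = lookup_env(name);
    const char *libc = getenv(name);
    if (mine == NULL || libc == NULL)
        return mine == libc;
    return strcmp(mine, libc) == 0;
}
```

For example, lookup_env("PATH") should return the same string as getenv("PATH") in any process, whether or not a shell started it.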
Consequently, execvp() can be implemented in pure C, with no shell involved anywhere.
Typically it is a libc wrapper around a lower-level call like execve, which directly instructs the kernel to replace the current process image with a new one. One could also implement it directly in the kernel, since the kernel has access to the process's environment block and can retrieve environment variables from there.
CodePudding user response:
how does execvp inspect the PATH variable if it does not even invoke a shell?
From the glibc sources (https://github.com/zerovm/glibc/blob/master/posix/execvp.c#L93): first, it simply reads the variable from the environment of the current process:
char *path = getenv ("PATH");
Then, for each directory in PATH (entries are separated by colons):
path = p;
p = __strchrnul (path, ':');
It prepends the directory taken from PATH to the name of the executable passed to execvp:
startp = (char *) memcpy (name - (p - path), path, p - path);
It then calls the execve system call with the resulting path:
/* Try to execute this name. If it works, execve will not return. */
__execve (startp, argv, __environ);
If the kernel's execve can find the file at that path, it runs it; if it can't, it fails with ENOENT (or a similar error):
case ENOENT:
case ESTALE:
case ENOTDIR:
/* Those errors indicate the file is missing or not executable
by us, in which case we want to just try the next path
directory. */
...
break;
The process of taking a directory from PATH, prepending it to the name passed to execvp, and calling the execve system call is repeated for each directory in PATH until execve succeeds or the directories are exhausted.