I want to understand bash builtins, hence the following questions:
- When I think of the term builtin, I picture the bash executable having a function defined in its symbol table that other parts of the executable can call without actually having to fork. Is this what builtin means?
- I also see that some builtins have a separate executable. For instance, type [ returns "[ is a shell builtin", but I also see an executable named /usr/bin/[. Is it correct to say that the same code is available through two executables: one through the bash program and another through /usr/bin/[?
CodePudding user response:
> the bash executable has a function defined in its symbol table
Most builtins are compiled into the Bash executable itself. Bash can also load additional builtins dynamically at runtime from a separate shared library (with enable -f).
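As a rough sketch of the dynamic loading mentioned above: bash's enable -f takes the path of a shared object and the name of the builtin to load from it. The directory /usr/lib/bash and the loadable sleep are assumptions (some distributions ship example loadables there, others put them elsewhere or do not ship them at all), and try_load is a made-up helper name.

```shell
#!/usr/bin/env bash
# try_load is a hypothetical helper: load builtin NAME from the shared
# object DIR/NAME, but only if that file actually exists.
try_load() {
    local dir=$1 name=$2
    [ -f "$dir/$name" ] || return 1
    enable -f "$dir/$name" "$name"
}

# /usr/lib/bash is an assumption; adjust it for your system.
if try_load /usr/lib/bash sleep; then
    type sleep    # sleep is now reported as a shell builtin
else
    echo "no loadable 'sleep' found; still using: $(type -P sleep)"
fi
```

If the load succeeds, later calls to sleep in that shell run inside the shell process instead of spawning the external program.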
> can access without actually having to fork
Yes. A builtin runs inside the shell process itself, so no fork or exec is needed.
> Is it correct to say that the same code is available through two executables: one through bash executable and another through /usr/bin/[?
No, the source code is different. One is a Bash builtin, compiled from Bash's own source; the other is a standalone program that is maintained separately (on most Linux systems, /usr/bin/[ comes from GNU coreutils). The two can also behave differently in edge cases:
$ printf "%q\n" '*'
\*
$ /bin/printf "%q\n" '*'
'*'
$ time echo 1
1
real 0m0.000s
user 0m0.000s
sys 0m0.000s
$ /bin/time echo 1
1
0.00user 0.00system 0:00.00elapsed 50%CPU (0avgtext 0avgdata 2392maxresident)k
64inputs 0outputs (1major 134minor)pagefaults 0swaps
$ [ -v a ]
$ /bin/[ -v a ]
/bin/[: ‘-v’: unary operator expected
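To check which implementation a bare name resolves to, and to pick one explicitly, bash's own type, builtin and enable builtins are enough. This is a minimal sketch assuming bash and an external echo somewhere on PATH; kind_of is a made-up helper name.

```shell
#!/usr/bin/env bash
# kind_of is a hypothetical helper: prints how bash classifies a name
# ("alias", "keyword", "function", "builtin" or "file").
kind_of() {
    type -t "$1"
}

kind_of echo    # builtin
kind_of '['     # builtin
kind_of time    # keyword (bash's time is a reserved word, not a builtin)

# type -a lists every match in lookup order: the builtin first,
# then any executables found on PATH.
type -a echo

# Force one implementation or the other.
builtin echo "from the builtin"
"$(type -P echo)" "from the external program"

# enable -n hides a builtin so the bare name falls through to PATH;
# a plain enable brings it back.
enable -n echo
kind_of echo    # file
enable echo
```

Note that time shows up as a keyword: in the session above, time echo 1 is handled by the shell's grammar, while /bin/time is an ordinary program.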
CodePudding user response:
Loosely speaking, the program versions of the built-ins are used when a shell interpreter is not available or not needed. Let's look at this in more detail.
When you run a shell script, the interpreter recognizes the built-ins and does not fork/exec; it simply calls the function that implements the built-in. Even if you invoke a built-in from a C/C++ executable through system(), the latter first launches a shell, and that spawned shell then runs the built-in.
Here is an example program that runs echo message through the system() library function:
#include <stdlib.h>

int main(void)
{
    /* system() hands the command line to /bin/sh -c */
    system("echo message");
    return 0;
}
Compile it and run it:
$ gcc msg.c -o msg
$ ./msg
message
Running it under strace with the -f option shows the processes involved. First, the main program is executed:
$ strace -f ./msg
execve("./msg", ["./msg"], 0x7ffef5c99838 /* 58 vars */) = 0
Then system() creates a child process, much as fork() would; on Linux this shows up as a clone() system call. Child process 5185 is launched:
clone(child_stack=0x7f7e6d6cbff0, flags=CLONE_VM|CLONE_VFORK|SIGCHLD
strace: Process 5185 attached
<unfinished ...>
The child process executes /bin/sh -c "echo message". That shell calls its echo built-in to display the message on the screen (the write() system call):
[pid 5185] execve("/bin/sh", ["sh", "-c", "echo message"], 0x7ffdd0fafe28 /* 58 vars */ <unfinished ...>
[...]
[pid 5185] write(1, "message\n", 8message
) = 8
[...]
+++ exited with 0 +++
The program versions of the built-ins are useful when you need them from a C/C++ executable without an intermediate shell, for the sake of performance, for instance when you call them through the execv() function.
Here is an example program which does the same thing as the preceding example but with execv() instead of system():
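#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char *av[3];

    av[0] = "/bin/echo";
    av[1] = "message";
    av[2] = NULL;
    execv(av[0], av);   /* replaces this process on success */
    perror("execv");    /* reached only if execv() failed */
    return 1;
}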
Compile and run it to see that we get the same result:
$ gcc msg1.c -o msg1
$ ./msg1
message
Let's run it under strace to get the details. The output is shorter because no sub-process is created to run an intermediate shell. The actual /bin/echo program is executed instead:
$ strace -f ./msg1
execve("./msg1", ["./msg1"], 0x7fffd5b22ec8 /* 58 vars */) = 0
[...]
execve("/bin/echo", ["/bin/echo", "message"], 0x7fff6562fbf8 /* 58 vars */) = 0
[...]
write(1, "message\n", 8message
) = 8
[...]
exit_group(0) = ?
+++ exited with 0 +++
Of course, if the program is supposed to do additional things, a simple call to execv() is not sufficient, because execv() replaces the calling program with /bin/echo. A more elaborate program would fork and have the child execute /bin/echo, still without running an intermediate shell:
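#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    if (fork() == 0) {
        /* Child: replace ourselves with /bin/echo */
        char *av[3];

        av[0] = "/bin/echo";
        av[1] = "message";
        av[2] = NULL;
        execv(av[0], av);
        _exit(127);   /* reached only if execv() failed */
    }

    wait(NULL);   /* Parent: wait for the child to finish */
    // Do additional things before ending
    return 0;
}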
Compile it and run it under strace to see that the child process executes the /bin/echo program without the need for an intermediate shell:
$ gcc msg2.c -o msg2
$ ./msg2
message
$ strace -f ./msg2
execve("./msg2", ["./msg2"], 0x7ffc11a5e228 /* 58 vars */) = 0
[...]
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLDstrace: Process 5703 attached
, child_tidptr=0x7f8e0b6e0810) = 5703
[pid 5703] execve("/bin/echo", ["/bin/echo", "message"], 0x7ffe656a9d08 /* 58 vars */ <unfinished ...>
[...]
[pid 5703] write(1, "message\n", 8message
) = 8
[...]
[pid 5703] +++ exited with 0 +++
<... wait4 resumed>NULL, 0, NULL) = 5703
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5703, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
exit_group(0) = ?
+++ exited with 0 +++
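The same distinction can be observed from the command line without writing any C. Tools such as env exec their target directly, with no shell in between, so they can only reach the program versions. The sketch below assumes bash and an external echo on PATH; run_without_shell is a made-up helper name.

```shell
#!/usr/bin/env bash
# run_without_shell is a hypothetical helper: env execs the named
# program directly, so shell builtins are out of its reach.
run_without_shell() {
    env "$@"
}

# echo has a program version, so this works and prints "message".
run_without_shell echo message

# shopt exists only inside bash, so env cannot find anything to exec.
if ! run_without_shell shopt 2>/dev/null; then
    echo "shopt has no external version; it exists only as a builtin"
fi
```

This is why builtins such as echo, printf, test and [ also ship as standalone programs, while shell-only ones such as shopt do not.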