How do the err/warn functions get the program name?

Time:05-29

From the err/warn manpage:

The err() and warn() family of functions display a formatted error
message on the standard error output. In all cases, the last component
of the program name, a colon character, and a space are output. If the
fmt argument is not NULL, the printf(3)-like formatted error message is
output.

If I make this call: warn("message"); it will output something like this:

a.out: message: (output of strerror here)

How do the warn/err functions find the name of the program (in this case, a.out) without seemingly having any access to argv at all? Does it have anything to do with the fact that they are BSD extensions?

CodePudding user response:

Such things can easily be figured out using the strace utility, which records all system calls.

I wrote the highly complex program test.c:

#include <err.h>
int main() { warn("foo"); }

Compiling it with gcc -o test -static test.c and running strace ./test yields the following (-static avoids the noise of loading a lot of shared libraries):

execve("./test", ["./test"], 0x7fffcbb7fd60 /* 101 vars */) = 0
arch_prctl(0x3001 /* ARCH_??? */, 0x7ffd388d6540) = -1 EINVAL (Invalid argument)
brk(NULL)                               = 0x201e000
brk(0x201edc0)                          = 0x201edc0
arch_prctl(ARCH_SET_FS, 0x201e3c0)      = 0
set_tid_address(0x201e690)              = 55889
set_robust_list(0x201e6a0, 24)          = 0
uname({sysname="Linux", nodename="workhorse", ...}) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
readlink("/proc/self/exe", "/tmp/test", 4096) = 9
getrandom("\x43\xff\x90\x4b\xa8\x82\x38\xdd", 8, GRND_NONBLOCK) = 8
brk(0x203fdc0)                          = 0x203fdc0
brk(0x2040000)                          = 0x2040000
mprotect(0x4b6000, 16384, PROT_READ)    = 0
write(2, "test: ", 6)                   = 6
write(2, "foo", 3)                      = 3
write(2, ": Success\n", 10)             = 10
exit_group(0)                           = ?
+++ exited with 0 +++

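And there you have it: a program can readlink /proc/self/exe to find out what it's called. Note that the trace only shows this readlink happening during startup; it does not by itself show that warn() takes the name from that result rather than from argv[0].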

CodePudding user response:

The err/warn functions prepend the basename of the program name. According to the answers to a related Stack Overflow post, there are a few ways to get the program name without access to argv.

One way is to call readlink on /proc/self/exe, then call basename on that. A simple program that demonstrates this:

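#include <libgen.h>
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

char *
progname(void)
{
    /* static: the returned pointer must stay valid after we return */
    static char path[PATH_MAX];
    ssize_t len;

    /* readlink does not null-terminate, so leave room and do it here */
    len = readlink("/proc/self/exe", path, sizeof(path) - 1);
    if (len == -1)
        return NULL;
    path[len] = '\0';

    /* basename may modify its argument and may return a pointer
     * into it; both are fine for our own static buffer */
    return basename(path);
}

int
main(void)
{
    const char *name = progname();

    printf("%s: this is my fancy warn message!\n", name ? name : "(unknown)");
    return 0;
}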

You can also use the nonstandard variable __progname, which exists only if your C library provides it, or program_invocation_short_name, which is a GNU extension declared in errno.h when _GNU_SOURCE is defined.
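
For the GNU variable, a minimal sketch might look like the following. It assumes glibc. The usual way to get the declaration is to define _GNU_SOURCE before including errno.h; here the declaration is repeated by hand so the sketch does not depend on feature-test macros. The names short_name and gnu_warn are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: glibc. <errno.h> declares this under _GNU_SOURCE;
 * the same declaration is repeated here by hand. */
extern char *program_invocation_short_name;

/* Hypothetical helper: return the name glibc recorded at startup. */
const char *
short_name(void)
{
    assert(program_invocation_short_name != NULL);
    return program_invocation_short_name;
}

/* Hypothetical warn-like function that never sees argv. */
void
gnu_warn(const char *msg)
{
    fprintf(stderr, "%s: %s\n", short_name(), msg);
}
```

Unlike the /proc/self/exe approach, this reports the name the program was invoked under (the last component of argv[0]), which can differ from the name of the executable file.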

CodePudding user response:

In pure standard C, there's no way to get the program name passed as argv[0] without getting it, directly or indirectly, from main. You can pass it as an argument to functions, or save it in a global variable.
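
A minimal sketch of the global-variable approach, using only standard C. The names saved_progname, save_progname, and my_warn are hypothetical; nothing in the standard library defines them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical global holding the name; set once from main. */
static const char *saved_progname = "(unknown)";

/* Call with argv[0]; keeps only the part after the last '/'. */
void
save_progname(const char *argv0)
{
    const char *slash;

    assert(argv0 != NULL);
    slash = strrchr(argv0, '/');
    saved_progname = slash ? slash + 1 : argv0;
}

/* Any function can now prefix its messages without seeing argv. */
void
my_warn(const char *msg)
{
    fprintf(stderr, "%s: %s\n", saved_progname, msg);
}
```

main would call save_progname(argv[0]) before anything else; after that, my_warn("message") prints the saved name, a colon, and the message.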

But system functions also have the option of using system-specific methods. On open-source operating systems, you can download the source code and see how it's done. For Unix-like systems, the relevant code is in libc.

For example, on FreeBSD:

  • The warn and err functions call the internal system function _getprogname().
  • _getprogname() reads the global variable __progname.
  • __progname is set in handle_argv which is called from _start(). This code is not in libc, but in CSU, which is a separate library containing program startup code.
  • _start() is the program's entry point. It's defined in the architecture-specific crt1*.c. It's also the function that calls main, and it passes the same argv to both handle_argv() and main().
  • _start is the first C function called in the program. It's called from assembly code that reads the argv pointer from the stack.
  • The program arguments are copied into the program's address space by the kernel as part of the implementation of the execve system call.
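
The chain in the list above can be sketched in portable C. This is only loosely modeled on that description and is not the FreeBSD source: every sketch_* name is made up, the real warn() writes to stderr and takes no stream argument, and the real startup code runs before main rather than being called by hand.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the __progname global. */
static const char *sketch_progname = "";

/* Stand-in for handle_argv(): record the last component of argv[0]. */
void
sketch_handle_argv(int argc, char *argv[])
{
    const char *s;

    assert(argv != NULL);
    if (argc > 0 && argv[0] != NULL) {
        sketch_progname = argv[0];
        for (s = sketch_progname; *s != '\0'; s++)
            if (*s == '/')
                sketch_progname = s + 1;
    }
}

/* Stand-in for _getprogname(): just reads the global. */
const char *
sketch_getprogname(void)
{
    return sketch_progname;
}

/* Stand-in for warn(): writes "name: message: strerror(errno)". */
void
sketch_warn(FILE *stream, const char *fmt, ...)
{
    int saved_errno = errno;    /* the fprintf calls may change errno */
    va_list ap;

    fprintf(stream, "%s: ", sketch_getprogname());
    if (fmt != NULL) {
        va_start(ap, fmt);
        vfprintf(stream, fmt, ap);
        va_end(ap);
        fprintf(stream, ": ");
    }
    fprintf(stream, "%s\n", strerror(saved_errno));
}
```

Calling sketch_handle_argv(argc, argv) at the top of main and then sketch_warn(stderr, "message") prints the last component of argv[0], the message, and the strerror text, which is the shape of the output shown in the question.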

Note that there are several concepts of “program name” and they aren't always equivalent. See “Finding current executable's path without /proc/self/exe” for a discussion of the topic.
